I. INTRODUCTION


A.  BACKGROUND 

The  rapid  influx  of  powerful  microcomputers  has  provided  both  the  incentive 
and  capability  to  enhance  the  productivity  of  humans.  These  powerful  and 
inexpensive  workhorses  are  being  exploited  for  automating  routine  tasks,  acquiring 
and  communicating  information,  and  the  intelligent  support  of  decision  making. 
Of  major  importance  is  the  effort  to  enhance  the  productivity  of  humans  who 
control  these  machines  through  the  use  of  human-computer  interfaces  that  both 
maximize  human  performance  and  take  advantage  of  the  growing  capabilities  of 
these  computer  systems. 

It  is  estimated  that,  for  over  95  percent  of  human-computer  interactions, 
people  costs  are  greater  than  the  machine  costs  [Infotech  79].  Actions  that  reduce 
the  human  cost  and  simplify  the  human  interface  will  have  great  impact  on  the 
computer  industry.  A  technology  must  explore  these  interfaces  in  order  to  grow 
and  develop  to  its  full  potential. 

Many forms of man-machine interfaces have been developed, including
cathode ray tube displays, printers, keyboards, and joysticks. However, speech is
recognized to be the most natural and fastest form of human communication, and
should be considered as an interface technique for system optimization [LeFever
87].

Research into voice recognition (VR) systems has been ongoing for over 30
years. Research into decision support systems (DSS), which evolved from
management information systems over 15 years ago, now is maturing. The two
technologies, which until now have matured separately, are logical candidates for
merging. Thus the focus of this study is the application of voice recognition
systems to decision support systems. A Glossary of Terms used in this study is
provided in Appendix A.

B.  VOICE  RECOGNITION  SYSTEMS 

Voice  recognition  is  defined  as  the  ability  of  a  computer  or  other  device  to 
recognize  spoken  words  correctly  and  to  translate  them  into  a  predetermined 
output  string  to  the  computer  [LeFever  87].  Voice  recognition  is  also  called 
automatic  speech  recognition  and  by  other  names,  as  listed  in  Appendix  B.  It  is 
important  to  note  that  the  term  voice  recognition  refers  to  and  concerns  only 
command  input  via  the  human  voice.  It  does  not  include  computerized  voice  output 
or  speech  synthesis. 

There are many advantages to using voice input to computer systems. In
general, a voice recognition system:

•  is  more  accurate  than  conventional  forms  of  input 

•  allows  for  concurrent  use  of  hands,  eyes,  and  other  senses 

•  allows  freedom  of  movement  from  a  specified  location 

•  can  be  used  in  low  light  or  dark  areas 

•  is  faster  than  conventional  forms  of  input 

•  promotes use of the computer system or application with which it is
used

•  is  easy  to  learn  and  easy  to  use 

•  promotes  productivity 

•  works  better  in  multilingual  environments  than  conventional  input 

•  works  equally  well  for  individuals  ranging  from  novice  typists 
through  expert  typists 

•  works well for many handicapped individuals [Poock 80, Poock 81,
Armstrong 80, Baker 84, LeFever 87]

Dobney classifies voice recognition as "a fifth generation language or more
concisely a fifth generation concept" [Dobney 87]. Voice recognition, along with
other fifth generation concepts, is expected to be critical to the future of all
computer applications.

C.  DECISION  SUPPORT  SYSTEMS 

There  is  no  generally  recognized  single  definition  of  decision  support  systems. 
The  definitions  in  use  cover  a  broad  spectrum  of  what  is  and  is  not  a  DSS  [Keen  87]. 

For  this  study,  the  following  definition  will  be  used: 

The application of available and suitable computer-based technology to
help improve the effectiveness of managed decision making in
semi-structured tasks. [Keen 87]

The  key  aspects  of  DSS  include: 

•  They  are  computer  based  systems. 

•  They  are  used  by  decision  makers. 

•  They  help  decision  makers  confront  ill-structured  problems. 

•  They  work  through  direct  interaction. 

•  They  utilize  data  analysis  models.  [Sprague  82] 

This  study  will  focus  on  the  fourth  aspect,  direct  interaction  between  the  decision 
maker  and  the  computer  system. 

The basic DSS has three components: data, dialog, and models [Sprague 82].
These are referred to as the DDM paradigm of a DSS and the relationships are
illustrated in Figure 1.1. The importance of the dialog component cannot be
overemphasized, since all the capabilities of the DSS must be articulated and
implemented through it.

Figure 1.1. The Dialog, Data, Model Components of the DSS
Framework [Sprague 82]


This dialog component consists of three subcomponents, as illustrated in Figure
1.2:

•  The  action  language  is  what  the  user  can  do  in  communicating  with 
the  system. 

•  The  presentation  or  display  language  is  what  the  user  sees. 

•  The knowledge base is what the user must know in order to operate
the system. This can take the form of help menus, reference cards or
instructions, a user's manual, or information that previously has been
learned.


[Figure 1.2 panels: Action Language (what you can input); Presentation
Language (what you see); Knowledge Base (what you need to know)]


Figure  1.2.  The  Dialog  System  User  Interface  [Sprague  80] 


This  study  primarily  considers  the  action  language  of  DSSs  and  its 
implementation  through  the  use  of  voice  input.  Secondary  consideration  is  given  to 
minimizing  the  size  of  the  knowledge  base  through  the  use  of  a  natural  language 
interface  and  by  optimizing  the  presentation  language  so  that  it  will  naturally 
encourage  and  prompt  proper  input. 

No single all-encompassing or overall best dialog mode presently exists. That
is, no system has the ability to handle a variety of human interaction styles, shifting
between styles at the user's request. Regardless of a user's experience with
computers or with the problem or tasks, the specific dialog mode of a given system
must be learned before the system can be used. This is true even if the user is
already familiar with another dialog mode for another system.

As noted by Sprague, "Dialog will profit significantly from the inclusion of
natural language processing techniques and voice recognition." [Sprague 87]

D. GOALS AND OBJECTIVES

The  primary  objective  of  this  study  is  to  provide  a  current,  concise,  condensed, 
and  summarized  single  source  of  data  that  will  enable  selection  of  an  appropriate 
voice  recognition  application  for  a  given  decision  support  system.  In  essence  this  is 
a  non-automated  aid  for  making  voice  recognition  system  decisions  related  to  the 
design  of  an  automated  DSS. 

A  secondary  objective  is  to  provide  users,  developers,  researchers,  and  all 
others  concerned  with  voice  recognition  input  with  a  current  reference  guide  to 
voice  recognition  research.  Keywords  used  in  locating  references  are  provided  in 
Appendix  B.  This  guide  is  included  in  Appendix  C,  an  annotated  bibliography  of 
current  VR  literature,  with  subappendices  that  contain  references  to  the  annotated 
bibliography  by  functional  areas  of  DSS.  Appendix  D  furnishes  the  publishing 
source  of  all  literature  contained  in  the  annotated  bibliography  and  thus  facilitates 
retrieval  of  hard-to-find  articles. 

A third objective is to provide a current listing of all commercially available
voice recognition systems. This list is contained in Appendix E,
along with information concerning the compatibility of these voice systems with
current computer systems. The voice recognition systems listed include a wide range
of capabilities, and are usable on systems varying in size from mainframe
computers to desktop microcomputers.

The  overall  goal  of  this  study  is  to  supply  a  useful  guide  for  decisions 
concerning  the  implementation  or  use  of  voice  input  for  decision  support  systems  as 
well as for other computer applications.

E. SCOPE AND METHODOLOGY

1.  Scope 

This  study  primarily  considers  only  current  voice  recognition  literature, 
that  is,  books,  articles,  and  reports  that  are  less  than  five  years  old  (published  after 
1  January  1983).  A  limited  amount  of  older  literature,  determined  especially 
pertinent  and  worthy  of  note,  also  is  included. 

Keywords used in searching the literature are listed in Appendix B.
Words representing voice and speech-related topics not included in this study also
are listed there. No experiments or case studies were conducted for this thesis.

2.  Research  Methodology 

Exhaustive  research  was  conducted  to  identify  all  current  and  accessible 
voice  recognition  literature  and  voice  recognition  systems.  This  research  was 
conducted  using  Naval  Postgraduate  School  and  University  of  California,  Santa 
Cruz,  resources  and  via  locally  accessible  computer  networks. 

The universe of papers from which the database was drawn consists of all
literature that contains keywords listed in Appendix B. Initially over 1000
references were located. These items were reviewed and filtered to determine
those applicable to DSSs. As a result of this review process, over 230 articles were
classified as applicable to DSSs and are included in the final database in the form of
an annotated bibliography. In many cases this bibliography also contains excerpts,
abstracts, or summaries of those articles related to voice recognition that are
considered to be useful for users, developers, researchers, and others concerned
with voice input to decision support systems.

II. DATA ANALYSIS

A.  BACKGROUND 

As  fifth  generation  computer  technology  approaches,  the  use  of  "intelligent 
systems"  will  give  increasing  flexibility  to  the  input  devices  of  the  future.  The  data 
collected  for  this  study  provides  knowledge  needed  to  pick  the  best  method  of 
human-computer  interaction  for  the  specific  environments  of  a  given  DSS. 

It  has  been  proposed  that  speech  is  the  human's  highest  capacity  and  most 
natural  form  of  communications  [Lombardo  84].  Therefore  computer  voice 
recognition  would  be  the  most  natural  way  for  humans  to  interface  with  machines. 
The  problem  preventing  the  widespread  acceptance  of  VR  seems  to  be  that  most 
people  are  simply  not  aware  that  VR  exists  or  what  it  can  really  do  for  them. 

This  chapter  discusses  various  research  areas  or  categories  of  both  voice 
recognition  systems  and  DSSs.  Data  are  placed  into  several  categories  in  order  to 
facilitate  locating  answers  to  specific  problems  and  to  aid  in  performing  research 
related  to  a  specific  DSS  application  or  environment.  These  categories  were 
arrived  at  through  an  empirical  process  of  reviewing  the  reports  and  noting  logical 
trends  in  the  literature.  Each  research  area  is  related  to  an  Appendix  in  this  report 
containing  references  to  articles  germane  to  that  area. 

B.  HUMAN  FACTORS 

Categories of human factors included in this study are (1) stress, (2)
multimodality, (3) user speaking experience level, (4) computer experience level,
and (5) the size of the vocabulary. These topics are related to several human
factors areas: occupational, operational, psychological, physiological, and
personal [Yellen 83].

Human  factors  is  discussed  first  because  of  its  importance.  No  matter  how  fast 
the  computer  is,  how  efficient  its  speech  recognition  algorithm  is,  or  how  pretty  its 
displays  are,  it  will  not  be  used  effectively  or  efficiently  unless  human  factors 
knowledge  applicable  to  system  implementation  has  been  reviewed  and 
incorporated. 

Appendix C1, Section 1, contains a listing of material applicable to the area of
human factors. Sections 2 through 6 of that Appendix include references that are
specific to each category within the scope of human factors.

1.  Stress  Related  Factors 

Stress  influences  the  sound  wave  frequency  of  an  individual’s  speaking 
voice.  Additionally,  stressed  speakers  often  appear  to  talk  in  longer  bursts,  with 
shorter  pauses  separating  the  bursts.  Psychological  stress  also  influences  an 
individual's  vocal  production  in  other  ways.  However,  there  is  no  consensus  in  the 
literature  concerning  how  stress  can  be  analyzed  to  predict  an  outcome. 

Stress  may  be  either  physiological,  psychological,  or  a  combination  of 
both.  Physiological  stress  is  more  clear  cut  than  psychological,  and  refers  to  the 
result  of  human  stresses  such  as  heat,  pressure,  electric  shock,  and  similar  stimuli. 
Psychological  stress  comes  from  many  sources  and  relates  to  an  individual's  ability 
to  cope,  adapt,  or  react  to  an  unfamiliar,  unfriendly,  or  threatening  environment, 
or  to  the  influence  of  that  environment  on  the  individual. 

Psychological stress can be further subdivided into situational and
self-induced stress. Situational stress is the influence of unfavorable environmental
factors (excluding physical factors) on an individual. These factors are beyond the
individual's control and may include circumstances such as public speaking,
deadlines, and quotas. Self-induced stress is the self-imposition of a condition or
stimulus. These include self-imposed goals, deadlines, or performance
requirements of any type with which an individual forces himself to function above
a "comfortable" or "easy" level [French 83]. It is important to remember that in
some cases it may not be possible to separate physical from psychological stimuli.
Research in the area of stress and voice recognition was found to be
limited. References are listed in Section 2 of Appendix C1.

2. Multimodal Factors

Voice recognition systems are unique in their ability to free the user's
mind and eyes for carrying out visual tasks. A voice recognition system permits
the user to view graphics, screens, and decision aids, to oversee personnel, or to
read from a data source without having to look away in order to communicate
with the computer.

Baker  states  in  her  keynote  address  to  the  First  International  Conference 
on  Speech  Technology: 

Just  as  Darwin  hypothesized  that  people  developed  spoken  rather  than 
gestural  language  so  as  to  free  up  their  hands  and  be  able  to  communicate 
in  the  dark  or  out  of  sight,  so  speech  recognition  has  seen  its  initial 
applications  in  "hands  busy,  eyes  busy"  applications.  [Baker  84] 

Voice recognition systems promise freedom from the distraction of
interrupting the flow of work to recall codes and find keys. Voice recognition can
free the operator from having to remain close to a specific physical installation,
such as a video display terminal or keyboard. Additionally, the use of a wireless
microphone permits extensive mobility while talking to computers. French states
that

Voice-input  could  enable  the  operator  to  continue  the  task  at  the 
terminal,  and  simultaneously  manipulate  a  visual  representation  of  the 
problem  they  are  involved  in,  for  others'  benefit.  This  is  a  potential  boom 
in  the  period  of  transition  from  a  symbolic  gestalt  to  an  era  of  much  more 
wide  spread  computer  literacy.  [French  83] 

As  cited  by  Yellen,  with  this  increased  mobility  also  come  increased 
problems;  breath  noise  can  now  create  a  serious  problem  [Yellen  83].  An 
individual  who  is  involved  in  little  or  no  physical  movement  while  engaged  in  voice 
recognition  can  obtain  very  high  recognition  accuracy,  but  errors  may  be  induced 
once  the  user  begins  to  move.  When  using  a  close-talking,  noise-concealing 
microphone,  inhaling  does  not  appear  to  cause  problems;  however,  exhaling  will 
produce  signal  levels  comparable  to  speech  levels. 

The advantage of having one's hands, eyes, and mind free to perform other
tasks could be the major contributing factor in the choice of voice recognition input
to a computer application. This multimodal aspect of voice recognition enhances or
complements traditional tactile input methods rather than replacing them entirely. A
listing of literature related to the multimodal aspects of voice recognition is
contained in Section 3 of Appendix C1.

3.  Speaker’s  Experience  Level 

Many studies have measured the speaker's experience with
voice recognition systems and the resulting quality of the output or task
performance. The research in this area is referenced in Section 4 of Appendix C1.

Most studies generally agree that, regardless of the initial experience level
of a speaker, novices quickly pick up voice recognition system skills and that their
performance improves rapidly toward the levels of experienced users. It is important
to note that professional typing skills require a long learning period and diminish
quickly with disuse. On the other hand, speaking is a natural output mode for the
human and is practiced every day by all. The user has only to restrict spoken
utterances to those which the machines can recognize.

4.  Computer  Experience  Level 

It is a credit to the adaptability of humans that they can use today's
software when so much of it still abounds with non-memorable commands.
Complex multiple command/control/shift keystrokes often are required which can
only be recalled by constant and experienced users. Commands that require precise
syntax, spacing, and order can be simplified by the use of voice commands. Once
the utterance is recognized by the computer it is input correctly. Long commands
or passwords which require accurate input and multiple keystrokes are easily
mistyped, but can be input accurately with a voice recognition system.

The video display can provide directions for the next voice input through
the use of menus or with a graphical representation. This can be of special value to
both DSSs and Group DSSs, enabling rapid "what if" brainstorming
or generation of alternatives.

Section 5 of Appendix C1 provides a guide to publications that deal with a
user's computer experience level. Many techniques are listed in these articles
which enable better performance, given a specific experience level.

5.  Vocabulary  Factors 

The vocabulary selected for a voice recognition system affects the speed
and accuracy of the system in many ways. The selection and structure of the
vocabulary is extremely important to the success of the system. The vocabulary
should be as natural as possible, while avoiding conflicting, confusing, or similar
sounding utterances.

Most  current  voice  recognition  systems  perform  well  with  small 
vocabularies.  When  the  size  of  these  vocabularies  gets  large  (greater  than  1000 
utterances)  the  probability  of  error  increases,  along  with  the  processing  time.  The 
possibility  of  confusion  between  words  increases  with  vocabulary  size  also,  as  does 
the  probability  that  similar  sounding  words  have  been  included.  Better  speech 
recognition  systems  usually  have  recognition  algorithms  designed  to  reject  rather 
than  guess  at  questionable  or  similar  words. 
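
The reject-rather-than-guess behavior described above can be sketched as
follows. This is a minimal illustration under stated assumptions, not the
algorithm of any particular recognizer: the function name `recognize`, the
similarity scale of 0 to 1, and both threshold values are hypothetical and
introduced here only for the example.

```python
def recognize(scores, min_score=0.80, min_margin=0.10):
    """Return the best-matching word, or None to reject the utterance.

    scores: dict mapping each vocabulary word to a similarity value in
    [0, 1] (an assumed scale). The utterance is rejected when the best
    match is weak, or when the runner-up is too close to the best match
    (i.e., two similar-sounding words cannot be told apart).
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if not ranked:
        return None
    best_word, best_score = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    if best_score < min_score or best_score - runner_up < min_margin:
        return None  # reject rather than guess
    return best_word
```

With these assumed thresholds, a clear match is accepted, while a near tie
between similar-sounding words is rejected so that the user can repeat the
utterance.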

Humans  have  a  low  tolerance  level  for  waiting  for  machines  and  for 
machines  that  make  errors;  studies  show  that  humans  tend  to  abandon  systems  that 
perform  in  this  manner.  With  very  large  vocabulary  sets,  the  amount  of  data  to  be 
processed  for  each  recognition  is  intolerably  large  unless  coding  is  optimal  and 
optimized  comparisons  are  used.  Accuracy  is  increased  and  recognition  time 
decreased  by  using  vocabulary  subsets.  A  given  subset  usually  is  entered  by  saying 
the  subset’s  name  or  title  (also  called  the  node  word).  Once  in  this  subset  or  node, 
the  system  will  search  and  recognize  only  the  words  included  in  this  subset.  This 
increases  both  speed  and  accuracy,  and  allows  for  different  output  for  a  given 
input. 

For example, a subset of numbers may be entered with the node word
"number". Only words representing those numbers contained within the node will
be recognized (along with node words which exit the subset). This allows the use of
homonyms (such as "two" and "to") without confusion. When in the subset of
"numbers", the utterance "to" or "two" will produce an output of "2". When in
other subsets the utterance "to" or "two" will produce the output string of "to" (or
any other preprogrammed output desired).
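
The node-word mechanism can be sketched in a few lines. The sketch below
assumes the acoustic matching has already been done and works on text labels
in place of real utterances; the class name `SubsetRecognizer`, the subset
names, and the node word "exit" are hypothetical choices made for
illustration only.

```python
# Hypothetical vocabulary: each subset (node) maps utterances to outputs.
VOCABULARY = {
    "main": {"to": "to", "two": "to", "print": "PRINT"},
    "number": {"one": "1", "to": "2", "two": "2", "three": "3"},
}

# Node words switch the active subset (illustrative names only).
NODE_WORDS = {"number": "number", "exit": "main"}


class SubsetRecognizer:
    def __init__(self, start="main"):
        self.active = start

    def hear(self, utterance):
        """Return the output string for an utterance, or None."""
        if utterance in NODE_WORDS:
            # A node word changes the active subset and emits nothing.
            self.active = NODE_WORDS[utterance]
            return None
        # Only words in the active subset are searched; all others
        # are rejected (None).
        return VOCABULARY[self.active].get(utterance)
```

Because only the active subset is searched, the homonyms "to" and "two" yield
"2" inside the number subset and "to" elsewhere, and a word such as "print" is
simply not recognized while the number subset is active.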

The  selected  vocabulary  can  also  be  used  to  overcome  problems  related  to 
cumbersome  program  commands  or  other  often-forgotten  commands  through 
allowing  for  various  input  utterances  to  result  in  the  same  output  string.  For 
example,  each  computer  network  has  a  specific  command  to  log  off  or  check  out  of 
the  system.  These  usually  differ  from  system  to  system,  and  it  may  be  difficult  to 
remember  which  is  required  for  each  system.  Programming  three  or  four 
different  utterances  that  produce  the  same  correct  output  command  will  alleviate 
this problem (e.g., "log out", "log off", "check out", and "bye bye" might all
correspond to the output string "LOGOFF ^M"; saying any of them produces the
desired result).
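
A many-to-one mapping of this kind might be set up as shown below. This is a
sketch only: the names `LOGOFF_OUTPUT`, `SYNONYMS`, and `output_for` are
hypothetical, and the code assumes that "^M" in the example denotes a carriage
return (Control-M), which is the conventional reading.

```python
# Output string sent to the host; "\r" is the carriage return ("^M").
LOGOFF_OUTPUT = "LOGOFF\r"

# Several utterances are programmed to produce the same output string.
SYNONYMS = {
    "log out": LOGOFF_OUTPUT,
    "log off": LOGOFF_OUTPUT,
    "check out": LOGOFF_OUTPUT,
    "bye bye": LOGOFF_OUTPUT,
}


def output_for(utterance):
    """Return the programmed output string, or None if not in the vocabulary."""
    return SYNONYMS.get(utterance.strip().lower())
```

The user no longer needs to remember which phrase a given system expects,
since every listed phrase yields the same command.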

Literature related to the area of speech recognition system vocabularies is
referenced in Section 6 of Appendix C1.

C.  ENVIRONMENTAL  FACTORS 

The  environment  in  which  a  system  will  be  used  can  play  a  decisive  role  in  the 
choice  of  the  input  device  and  the  voice  recognition  system  to  be  used.  In  a  United 
Nations  command  center  that  is  dark,  noisy,  and  filled  with  people  from  many 
nations  with  varied  languages  and  customs,  typing  commands  to  a  computer  in  one 
language  in  a  fixed  syntax  is  not  practical.  A  well-implemented  voice  recognition 
system  can  do  this  job  faster  and  without  the  mistakes  normally  associated  with 
human  translators.  This  "Tower  of  Babel"  in  which  one  can  communicate  as  if  with 
one  tongue  can  be  implemented  with  current  technology  through  proper  design. 


References  to  environment-related  studies  and  research  are  found  in  Section  1 
of  Appendix  C2.  Subsets  of  these  references,  related  to  specific  environmental 
factors,  are  provided  in  Sections  2  through  6  of  Appendix  C2. 

1.  Multilingual  Factors 

The UN example may be the extreme, but in this world of instant
worldwide telecommunications, international businesses, and melting pot nations,
computers frequently must interface with people who speak different languages.
Voice  recognition  systems  are  unconcerned  with  what  language  is  spoken.  They 
operate  by  matching  the  pattern  of  a  given  voice  input  (utterance)  with  a  known 
pattern  and  then  outputting  some  predesignated  command  string,  therefore  acting 
somewhat  like  a  translator.  For  example,  three  languages  may  be  spoken  in  an 
office  (English,  Spanish,  Hindi).  The  computer  software  requires  input  in  English. 
It  is  impractical  to  teach  all  the  personnel  both  English  and  the  commands  required 
to  operate  the  computer.  A  voice  recognition  system  could  be  installed  that 
"understands"  utterances  in  all  three  languages  and  outputs  the  English  commands 
that  the  software  requires. 
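
The reason such a system is indifferent to language can be illustrated with a
nearest-template match. In the sketch below every number is a made-up
placeholder rather than real acoustic data, and the three-element feature
vectors, the `TEMPLATES` table, and the function `command_for` are all
assumptions introduced for illustration. A real system would derive its
templates from training utterances.

```python
import math

# Hypothetical stored templates: (feature vector, English output command).
# The feature values are placeholders, not measurements of real speech.
TEMPLATES = [
    ((0.9, 0.1, 0.2), "OPEN"),   # trained on "open"      (English)
    ((0.2, 0.8, 0.3), "OPEN"),   # trained on "abrir"     (Spanish)
    ((0.4, 0.3, 0.9), "OPEN"),   # trained on "kholo"     (Hindi)
    ((0.8, 0.7, 0.1), "CLOSE"),  # trained on "close"     (English)
    ((0.1, 0.2, 0.6), "CLOSE"),  # trained on "cerrar"    (Spanish)
    ((0.6, 0.9, 0.8), "CLOSE"),  # trained on "band karo" (Hindi)
]


def command_for(features):
    """Return the command of the stored pattern closest to the input."""
    _, command = min(TEMPLATES, key=lambda t: math.dist(t[0], features))
    return command
```

The matcher never inspects which language a template came from. It compares
patterns only, so adding a language amounts to storing more templates that
point at the same English command.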

Research  and  other  literature  related  to  voice  recognition  with 
multilingual  environments  is  found  in  Section  2  of  Appendix  C2. 

2.  Multicultural  Factors 

Multicultural factors arise when different people have different ideas,
styles, or ways of doing things. All computer operating systems perform similar
functions, but there are subtle differences in the way commands are activated. For
example, for a simple file transfer, the UNIX operating system uses a specific
syntax that is completely different from that used by an IBM operating system.
Switching between MS-DOS, Z-DOS, Apple DOS, and the Macintosh operating
systems usually will require the user to look up the desired commands.

Voice  recognition  systems  can  ease  these  difficulties  by  doing  the  lookup 
for  the  user;  the  same  phrase,  "save  and  quit",  can  be  programmed  to  produce  the 
same  result  on  all  systems.  Voice  recognition  can  also  help  equalize  the  varied 
experience,  training,  and  typing  skills  of  workers  or  executives  exposed  to  new 
systems  or  new  situations. 
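
The lookup that the voice system performs on the user's behalf might resemble
the table below. The sketch uses the file-copy case because the UNIX `cp` and
MS-DOS `COPY` commands are well known; the spoken phrase "copy file", the
table layout, and the function `translate` are hypothetical.

```python
# One spoken phrase, one command template per operating system.
COMMAND_TABLE = {
    "copy file": {
        "unix": "cp {src} {dst}",
        "ms-dos": "COPY {src} {dst}",
    },
}


def translate(phrase, system, **params):
    """Return the command string for a spoken phrase on a given system."""
    template = COMMAND_TABLE[phrase][system]
    return template.format(**params)
```

For example, `translate("copy file", "unix", src="a.txt", dst="b.txt")` yields
`cp a.txt b.txt`, while the same phrase with `"ms-dos"` yields
`COPY a.txt b.txt`. The user speaks one phrase regardless of the system in
use.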

Literature  sources  related  to  multicultural  factors  are  referenced  in 
Section  3  of  Appendix  C2. 

3.  Command  and  Control  Environments 

Military  establishments  have  done  much  work  toward  application  of  voice 
recognition  systems  in  the  command  and  control  environment.  The  result  of  this 
work  has  been  the  acceptance  and  implementation  of  operational  voice  recognition 
systems  in  both  strategic  and  tactical  command  and  control  environments.  Most  of 
this  research  can  also  benefit  civilian  business  and  industry  applications.  A  listing 
of  current  research  relating  the  areas  of  voice  recognition  systems  and  command 
and  control  is  provided  in  Section  4  of  Appendix  C2. 

4.  High  Noise  Environments 

Voice recognition systems have been used effectively in quiet office
environments and also in noisy industrial assembly areas (noise levels in excess of
100 dB). Although voice recognition equipment manufacturers have endeavored to
make their equipment work equally well in both environments, there are some
locations where it is still too noisy for voice recognition systems to operate unaided.
In such environments the use of a soundproof booth or a mask (such as a
noise-reducing stenographer's mask) can help; external noise is diminished and
effective voice recognition can take place.


Most  researchers  agree  that,  when  using  speaker  dependent  systems, 
"training"  voice  samples  should  be  collected  in  the  environment  in  which  they  will 
be  used.  This  is  especially  true  with  noisy  environments. 

Another  method  to  improve  voice  recognition  in  a  noisy  environment  is 
to  use  a  speech  enhancement  algorithm.  This  is  a  software  technique  used  to  clean 
up  the  speech  pattern  before  it  enters  the  recognition  device.  A  noise  concealing 
microphone  (like  those  that  have  been  used  in  aircraft  for  years)  also  can  be  used. 
This  microphone  samples  the  environmental  background  noise  and  aids  in 
canceling  out  this  background  noise  prior  to  its  being  sent  to  the  recognizer. 

When  noise  is  a  consideration  in  the  environment,  a  close  look  at  research 
in  this  area  is  critical.  Even  for  quiet  office  environments,  an  understanding  of 
noise  as  it  relates  to  voice  recognition  is  recommended.  Most  mechanical  things 
make  noise,  some  at  frequencies  that  the  human  cannot  hear  or  chooses  to  ignore 
due to familiarity. The noise of a car, airplane, copy machine, or elevator during
training  or  execution  of  voice  recognition  commands  can  result  in  puzzling 
problems.  Noise-related  articles  and  research  are  listed  in  Section  5  of  Appendix 
C2. 

5.  Low-Light  Environments 

Low-light environments include both dimly lit control rooms and
completely darkened auditoriums. In these environments, lighting can interfere
with the performance of the operators' primary mission. The cockpit of an aircraft
and the bridge of a ship are specific environments where good night vision is
paramount. During daylight, normal manual input devices are adequate. At night,
a light source can have life-threatening consequences. A voice recognition system
allows for sightless input of computer commands plus mobility.

Voice  recognition  systems  can  be  used  to  control  the  lights  in  a  room.  A 
more  complex  use  would  involve  a  microprocessor  voice  recognition  system  in  a 
welder's helmet that controls the welding unit, turning it on and off and also
controlling  the  voltages  or  gas  flow  remotely. 

References  relating  voice  recognition  systems  to  low-light  environments 
are  listed  in  Section  6  of  Appendix  C2. 

D.  SITUATIONAL  FACTORS 

Situational  factors  covered  in  this  study  include  (1)  system  use  by  a  group,  (2) 
use  by  an  individual,  and  (3)  use  by  handicapped  persons.  Appendix  C3,  Section  1, 
provides  a  complete  list  of  voice  recognition  systems  references  related  to  such 
situational  factors. 

1.  Multiuser  or  Group  Usage 

A  multiuser  system  is  a  single  system  that  is  used  by  many  people  but  only 
one  at  a  time.  Group  usage  is  the  use  of  a  system  by  many  people  during  the  same 
time  period.  Both  multiuser  and  group  usage  have  similar  problems  and 
characteristics  and  have  thus  been  grouped  together  in  this  study. 

Multiuser-oriented  systems  can  be  either  speaker  dependent  or 
independent.  They  can  use  either  continuous  or  discrete  speech  recognition
algorithms.  These  terms  are  defined  as  follows.

•  Speaker  Dependent  Systems:  require  adaptation  (or  "training")  of  the  voice
   recognition  system  to  the  speech  characteristics  of  each  user  in  order  to
   achieve  recognition.

•  Speaker  Independent  Systems:  recognize  speech  regardless  of  the  speaker,
   and  without  system  training  in  recognition  of  individual  speech
   characteristics  of  users.

•  Continuous  Speech  Recognition:  the  process  of  extracting  information  from
   strings  of  words  even  though  the  words  run  together  as  in  natural  speech.
   [Yellen  83]

•  Discrete  (Isolated)  Speech  Recognition:  the  process  of  transforming  discrete
   utterances  (those  with  a  significant  pause  between  utterances)  into
   computer-recognized  speech  or  text.

Although  speaker  independent,  continuous  systems  are  better  suited  and 
require  less  training  for  multiple  users,  other  combinations  should  not  be  ruled  out, 
as  they  offer  some  advantages  in  specific  circumstances.  If  the  group  situation  also 
involves  environmental  factors  (such  as  in  a  multilingual,  high  noise  command 
post),  the  difficulty  of  selecting  a  system  is  compounded.  Speed  or  vocabulary  size 
or  robustness  may  dictate  that  a  speaker  dependent,  discrete  speech  system  be  used, 
even  though  system  training  time  is  higher  and  sampling  is  required. 

Implementing  voice  recognition  input  to  a  Group  Decision  Support 
System  (GDSS)  is  difficult  since  there  are  four  basic  GDSS  typologies,  each 
presenting  its  own  unique  problems.  Figure  2.1  shows  these  four  typologies.  [Bui 
87]. 

Figure  2.1  (a)  shows  a  bilateral  relationship  between  a  single-user- 
oriented  DSS  and  a  group  of  users,  the  latter  being  considered  as  a  whole.  The
purpose  of  such  a  DSS  is  in  essence  the  same  as  a  single-user  DSS.  [Bui  87]  In  this
situation  a  voice  recognition  system  that  is  robust  enough  to  fit  the  needs  of  the 
group  is  required.  If  the  size  of  the  group  is  small  and  its  composition  constant,  a 
discrete,  speaker  dependent  system  (requiring  system  training  by  the  users)  is 
practical.  Otherwise,  a  speaker  independent,  continuous  speech  system  would  be 
most  appropriate.  With  a  varying  group,  the  cost  and  time  required  to  sample  and 
train  each  user  and  the  constraints  on  vocabulary  size  could  be  prohibitive.  Figure
2.1  (b)  extends  the  previous  typology  to  include  a  GDSS,  and  has  the  same
associated  problems.


Figures  2.1  (c)  and  (d)  illustrate  a  multilateral  relationship  between  a 
member  of  a  group  (via  a  network  of  individual  DSSs)  and  a  GDSS.  This  typology 
allows  the  customization  of  individual  DSSs  to  suit  the  needs  of  users.  Currently
the  cost  of  a  GDSS  of  this  nature  is  too  great  for  most  user  organizations;
centralized  or  off-site  facilities  (leased  from  or  provided  by  a  vendor),  used  by
many  diverse  groups,  are  the  norm.  Requirements  for  minimal  training  time  and 
the  variability  of  users  usually  necessitate  the  use  of  a  robust,  speaker  independent, 
continuous  speech  system. 

There  is  no  perfect  solution  to  all  situations.  Each  installation  should  be 
evaluated  on  its  own  merit  by  well-informed  analysts.  Section  2  of  Appendix  C3 
provides  references  to  research  in  this  area. 

2.  Individual  Usage 

Voice  recognition  for  individual  usage  offers  the  greatest  possible 
number  of  options.  Many  factors  can  be  considered  when  optimizing  the  system, 
which  can  be  speaker  dependent  or  independent,  and  use  continuous  or  discrete 
recognizers. 

Voice  recognition  systems  can  also  be  used  to  augment  other  input 
devices.  They  can  be  used  simultaneously  with  keyboards  and  pointing  devices.  In 
the  fields  of  desktop  publishing,  graphics  manipulation,  or  computer-aided  design, 
the  task  of  entering  text  is  secondary  to  the  drawing  of  shapes  or  manipulation  of 
objects  on  a  screen.  A  voice  recognition  system  or  a  "talkwriter"  can  be  used  to 
perform  a  text  entry  task  and  thus  not  break  the  flow  of  carrying  out  the  primary
task. 

The  most  important  constraint  when  designing  a  system  is  the  time  and 
effort  required  for  training.  References  relating  voice  recognition  systems  to 
individual  users  are  provided  in  Section  3  of  Appendix  C3. 

3.  Handicap  Situations

A  physical  handicap  does  not  impair  a  person's  mental  ability  or  ability  to 
produce.  Just  as  a  person  with  an  amputated  leg  is  given  a  prosthetic  device  to  allow 
mobility,  a  voice  recognition  system  can  be  used  as  a  prosthesis  that  can  compensate
for  some  physical  handicaps.  Much  work  has  been  done  in  this  area  to  bring 
independence,  mobility,  and  productivity  to  the  handicapped.  Voice  recognition 
systems  not  only  can  be  used  by  the  handicapped  to  operate  computers,  but  they  also 
can  be  used  to  control  or  manipulate  other  mechanical  devices. 

Wheelchairs,  prosthetic  devices,  communication  devices,  environmental 
controls,  and  many  other  systems  may  be  controlled  via  the  voice.  The  highly 
individual  nature  of  designing  a  voice  recognition  system  for  the  handicapped  can 
result  in  the  use  of  small,  lightweight,  power  efficient,  portable  units,  fine-tuned 
for  the  user  and  his  or  her  needs. 

Research  related  to  the  handicapped  and  voice  recognition  is  located  in 
Section  4  of  Appendix  C3.  Much  of  this  research  is  equally  applicable  for  use  with 
non-handicapped  individuals. 

E.  QUANTITATIVE  FACTORS 

Some  of  the  benefits  or  advantages  of  computer  voice  recognition  systems  are 
subjective  (user  convenience  or  preference).  Other  aspects  are  undeniably 
quantitative.  These  include  response  and  task  time,  accuracy,  speed  of  entry,  ease 
of  use,  and  user  productivity.  References  that  evaluate  or  discuss  these  quantitative 
measures  are  found  in  Section  1  of  Appendix  C4. 

1.  Time 

Time  savings  can  be  measured  in  many  ways.  Baker  cites  data  from 
experiments  that  show  communications  via  typewriter  or  hand-writing  cannot  even 
approach  speech,  in  terms  of  time  or  task  efficiency  [Baker  84].  Time  savings,  in
terms  of  hours  required  to  train  the  user  on  the  system  or  in  actual  hours  saved  by
the  use  of  voice  recognition,  are  significant,  especially  in  common  environments.
As  voice  recognition  systems  become  commonplace  and  familiar,  the  time  saved  in
training  personnel  is  expected  to  increase. 

References  in  the  area  of  response  and  task  time,  related  to  voice 
recognition  systems,  are  included  in  Section  2  of  Appendix  C4. 

2.  Accuracy 

One  of  the  selling  points  of  voice  recognition  systems  is  the  accuracy  of 
task  performance.  Once  an  utterance  is  correctly  "understood",  the  system  will 
produce  a  precise  and  correct  output.  However,  two  types  of  errors  may  occur:
rejection  and  misrecognition.  Rejection  is  the  inability  of  a  recognizer  to  classify  an
utterance  correctly.  Misrecognition  happens  when  a  recognizer  classifies  an
utterance  as  something  other  than  what  was  spoken.  Since  misrecognition  is 
potentially  more  serious,  most  good  recognizers  are  designed  to  reject  rather  than 
guess  at  marginal  pattern  matches. 
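
The reject-rather-than-guess policy described above can be illustrated with a minimal Python sketch. It assumes the recognizer yields a similarity score between 0 and 1 for each vocabulary word; the function name, the threshold of 0.80, and the margin of 0.10 are hypothetical values chosen for illustration and are not taken from any system covered in this study.

```python
from typing import Dict, Optional, Tuple


def classify(scores: Dict[str, float],
             accept_threshold: float = 0.80,   # assumed value, for illustration
             min_margin: float = 0.10          # assumed value, for illustration
             ) -> Tuple[Optional[str], str]:
    """Return (word, reason); word is None when the utterance is rejected."""
    if not scores:
        return None, "rejected: no candidates"
    # Rank candidate words from best to worst match.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    best_word, best_score = ranked[0]
    # A marginal best match is rejected instead of being guessed at.
    if best_score < accept_threshold:
        return None, "rejected: best match below threshold"
    # Two near-equal candidates risk a misrecognition, so reject as well.
    if len(ranked) > 1 and best_score - ranked[1][1] < min_margin:
        return None, "rejected: two candidates too close"
    return best_word, "accepted"
```

Under these assumptions a clear match such as {"stop": 0.95, "start": 0.60} is accepted, while a weak or ambiguous match is rejected so that the user can repeat the utterance.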

Experiments  have  shown  accuracy  rates  ranging  from  a  high  of  99.8 
percent  to  lows  in  the  range  of  88.6  percent.  The  accuracy  required  of  a  system 
depends  on  the  criticality  of  its  application  and  the  consequences  of  errors  in  the 
entered  data. 

Research  has  shown  that  183  percent  more  errors  occur  during  manual 
data  manipulation  (typing)  than  when  a  voice  recognition  system  is  used  [Yellen 
83].  Common  typing  errors  such  as  the  transposition  of  numbers  or  letters  are 
almost  eliminated  with  voice  recognition.  Correct  entry  of  numbers  is  especially 
important  since  automated  spelling  and  grammar  checkers  can  catch  most  letter 
transpositions. 

Voice  recognition  accuracy  can  be  improved  in  many  ways,  as  covered  in 
the  Training  Factors  Section  of  this  Chapter.  Briefly  stated,  recognition  accuracy 
depends  primarily  on  how  the  equipment  is  trained  and  on  the  experience  level  of
the  speaker.  Computer  experience,  time  of  week,  accent,  vital  capacity  and  rate  of
air  flow,  speaker  cooperativeness,  and  anxiety  all  affect  accuracy  to  a  lesser  extent.
References  providing  other  data  concerning  accuracy  are  included  in  Section  3  of 
Appendix  C4. 

3.  Speed  of  Entry 

Most  researchers  agree  that  speech  input  is  faster  than  keyboard  input. 
Most  individuals  can  speak  twice  as  fast  as  the  average  typist  can  type.  With  a 
greater  number  of  nontypists  gaining  access  to  computers,  faster  input  modes  are 
needed.  The  Macintosh  personal  computer  from  Apple  uses  a  pointing  device, 
pull-down  windows,  and  other  enhancements  (which  augment  the  keyboard)  to 
produce  a  more  natural  interface.  Experiments  evaluating  the  Macintosh's
pull-down  windows  in  comparison  with  continuous  voice  recognition  input
demonstrated  a  distinct  advantage  in  using  continuous  speech  over  the  pull-down
window  technology  of  the  Macintosh.  [Sweeney  86]

In  other  research,  after  only  three  hours  of  training,  subjects  were  17 
percent  faster  using  voice  entry  than  typing  [Yellen  83]. 

References  concerning  task  completion  speed  are  listed  in  Section  4  of 
Appendix  C4. 

4.  Ease  of  Use 

Various  studies  have  been  carried  out  that  demonstrate  that  speech  input  is 
easy  to  learn  and  easy  to  use.  Users  also  develop  a  preference  for  speech  input  in 
time.  References  to  these  studies  are  located  in  Section  5  of  Appendix  C4. 

5.  Productivity

Computers  excel  in  performing  repetitious,  time  consuming,  and  boring 
tasks;  humans  do  not.  Thus  productivity  will  be  increased  when  such  tasks  can 
easily  be  turned  over  to  a  computer,  especially  if  voice  commands  can  be  used  to 
initiate  the  desired  operations. 

One  device  that  uses  a  voice  system  to  increase  productivity  is  the 
"talkwriter"  or  voice  dictator.  As  the  user  speaks,  words  are  recognized,  entered 
into  a  file,  and  displayed  on  a  screen.  When  more  than  one  interpretation  is 
possible,  the  system  may  provide  a  list  of  its  best  guesses;  the  user  selects  one. 
Better-developed  models  have  very  large  vocabularies  and  automatic  sentence 
punctuation. 

References  relating  voice  recognition  systems  and  productivity  are  listed 
in  Section  6  of  Appendix  C4. 

F.  TRAINING  FACTORS 

Training  of  the  user  and  the  voice  recognition  system  is  one  of  the  most 
important  considerations  in  the  effective  implementation  of  systems.  Methods  of 
training  depend  on  the  type  of  voice  system  being  implemented:  speaker  dependent 
or  independent  systems,  and  continuous  or  discrete  speech  systems.  Certain 
training  techniques  have  been  developed  that  can  improve  recognition  accuracy  and 
reduce  errors.  The  complete  list  of  references  to  training  is  found  in  Section  1  of 
Appendix  C5. 

1.  Speaker  Dependent  Systems 

Speaker  dependent  systems  require  that  samples  of  the  potential  user's 
voice  be  placed  in  computer  memory.  The  system  basically  is  tuned  for  each  user’s 
voice.  Usually  these  systems  work  better  than  speaker  independent  systems
because  the  dependent  system  contains  samples  of  the  actual  user's  voice.  [Poock  83]

Speaker  dependent  systems  are  well  suited  to  situations  where  the  same
users  perform  the  same  job  day  in  and  day  out.  However,  consistency  is  also  a  key 
element  in  successful  recognition  accuracy:  a  speaker  may  talk  quite  differently 
when  training  the  machine  than  during  operational  use.  Whenever  possible,
training  should  be  conducted  in  the  same  environment  as  the  equipment  will  be 
operated  in,  to  minimize  variability  that  may  affect  recognition  accuracy.  Other 
factors  that  affect  training  and  recognition  accuracy  are  age,  physical  condition, 
fatigue,  stress  (emotional  or  physical),  time  of  week,  breath  noise,  microphone 
placement,  familiarity,  illness,  peer  pressure,  workload,  and  external  noise 
changes.  When  changes  must  occur,  a  new  "training"  session  will  usually  retune 
the  system  and  restore  accuracy. 

Vocabulary  size  also  affects  recognition  accuracy.  As  familiarity  with  a 
voice  recognition  system  increases  and  the  vocabulary  is  expanded,  there  will  be 
more  utterances  that  sound  alike  or  similar  to  the  recognizer;  the  system  may  start 
to  reject  words  as  unrecognized  that  formerly  were  accepted.  To  improve
recognition  of  troublesome  words,  using  duplicate  words  trained  separately 
sometimes  will  increase  performance  of  that  particular  word. 

References  to  current  research  related  to  speaker  dependent  systems  are 
listed  in  Section  2  of  Appendix  C5. 

2.  Speaker  Independent  Systems 

A  speaker  independent  speech  system  contains  algorithms  that  can  handle 
many  different  voices  and  dialects.  The  system  is  designed  to  recognize  the  voice 
of  anyone  who  uses  it,  and  thus  is  useful  when  many  people  are  expected  to  operate 
it  daily.  Unlike  speaker  dependent  systems,  speaker  independent  systems  do  not
require  samples  of  a  given  user's  voice.  As  a  result,  speaker  independent  systems 
do  not  usually  perform  as  well  as  speaker  dependent  systems  that  are  tuned  to  a 
specific  user's  vocal  characteristics. 

Vocabulary  size  and  structure  play  an  especially  important  part  in  voice 
recognition  accuracy  with  speaker  independent  systems.  As  the  size  of  the 
vocabulary  increases,  the  possibility  of  confusion  between  words  also  increases 
since  there  is  a  greater  chance  that  there  will  be  similar  sounding  words. 

References  related  to  speaker  independent  voice  recognition  systems  are 
listed  in  Section  3  of  Appendix  C5. 

3.  Continuous  Speech  Recognition 

Continuous  or  connected  speech  recognition  systems  can  extract 
information  from  strings  of  words  even  though  the  words  run  together  as  in 
natural  speech.  Continuous  speech  is  much  more  natural  for  humans  to  use  than  is 
discrete  speech,  which  requires  pauses  between  utterances.  During  the  1970s,  most 
voice  recognition  systems  used  discrete  speech.  More  recently,  many  accurate  and 
inexpensive  connected  speech  systems  have  been  developed. 

Continuous  speech  systems  can  either  be  speaker  dependent  or 
independent.  They  usually  involve  larger  vocabularies  and  require  more  powerful 
computers  to  run  them.  "Talkwriter"  devices,  discussed  earlier,  are  connected 
speech  systems  with  very  large  vocabularies.

A  new  approach  to  continuous  recognition  moves  away  from  matching 
scheme  algorithms  to  more  flexible  "phonetic"  recognition  schemes.  Phonemes, 
the  basic  units  of  all  speech,  are  the  basis  for  phonetic  recognition.  This  type  of 
system  is  trained  using  words  incorporating  all  combinations  of  phonemes.  The
formulation  of  new  words  from  these  phonemes  then  is  possible. 
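
The idea that new words can be formed from already-trained phonemes can be sketched as follows. This is an illustrative simplification, not the method of any cited system: it assumes a front end has already converted speech into a list of phoneme symbols, and it uses a greedy longest-match lookup in place of a real search procedure. The phoneme symbols, the lexicon entries, and the function names are all hypothetical.

```python
from typing import Dict, List, Tuple

Lexicon = Dict[Tuple[str, ...], str]

# Hypothetical pronunciation lexicon: phoneme sequence -> word.
LEXICON: Lexicon = {
    ("s", "t", "aa", "p"): "stop",
    ("g", "ow"): "go",
}


def add_word(lexicon: Lexicon, word: str, phonemes: List[str]) -> None:
    """Add a word by listing its phonemes; no new voice sample is needed."""
    lexicon[tuple(phonemes)] = word


def decode(phonemes: List[str], lexicon: Lexicon) -> List[str]:
    """Greedy longest-match decoding of a phoneme stream into words."""
    words: List[str] = []
    max_len = max((len(key) for key in lexicon), default=0)
    i = 0
    while i < len(phonemes):
        # Try the longest possible entry first, then shorter ones.
        for length in range(min(max_len, len(phonemes) - i), 0, -1):
            key = tuple(phonemes[i:i + length])
            if key in lexicon:
                words.append(lexicon[key])
                i += length
                break
        else:
            i += 1  # no entry starts here; skip this phoneme
    return words
```

With this lexicon, the stream g, ow, s, t, aa, p decodes to "go" followed by "stop", and a call such as add_word(LEXICON, "top", ["t", "aa", "p"]) extends the vocabulary without any further speaker training.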

References  relating  to  continuous  speech  recognition  systems  are  listed  in 
Section  4  of  Appendix  C5. 

4.  Discrete  Speech  Recognition 

Discrete  or  isolated  speech  recognition  is  the  process  of  transforming 
discrete  utterances  into  computer-recognized  commands  or  text.  Discrete  speech 
contains  a  significant  pause  between  utterances.  A  discrete  speech  recognizer  must 
be  able  to  detect  a  pause  or  low  energy  gap  in  order  to  function.  Humans,  however, 
sometimes  find  it  difficult  to  speak  with  isolated  words  or  broken  phrases;  hence 
discrete  speech  is  not  the  most  natural  or  desirable  form  of  voice  recognition. 
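
The pause detection that a discrete recognizer depends on can be sketched in Python as below. The sketch assumes the audio has already been reduced to one energy value per frame; the energy floor and the minimum pause length are hypothetical parameters chosen for illustration, not values from a particular product.

```python
from typing import List, Optional, Tuple


def split_utterances(frame_energy: List[float],
                     energy_floor: float = 0.1,    # assumed silence level
                     min_pause_frames: int = 3     # assumed pause length
                     ) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices, end exclusive, one pair per utterance."""
    utterances: List[Tuple[int, int]] = []
    start: Optional[int] = None   # frame where the current utterance began
    quiet = 0                     # consecutive low-energy frames seen so far
    for i, energy in enumerate(frame_energy):
        if energy >= energy_floor:
            if start is None:
                start = i
            quiet = 0             # a short dip does not end the utterance
        elif start is not None:
            quiet += 1
            if quiet >= min_pause_frames:
                # The pause is long enough: close the utterance where it went quiet.
                utterances.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        # Input ended mid-utterance; drop any trailing quiet frames.
        utterances.append((start, len(frame_energy) - quiet))
    return utterances
```

A speaker who does not leave a gap of at least min_pause_frames between words would have the words merged into one utterance, which is the practical reason discrete systems require deliberate pauses.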

Until  recently,  almost  all  commercial  applications  of  voice  recognition 
technology  have  been  discrete  voice  recognition  systems.  Discrete  systems  still 
offer  some  advantages  over  continuous  recognition  systems  in  the  areas  of  speed, 
accuracy,  and  especially  cost.  An  extensive  listing  of  currently  available 
commercial  voice  recognition  systems  is  contained  in  Appendix  E.  Usually,  unless 
a  system  is  advertised  as  being  continuous  or  connected,  it  is  understood  to  be  of  the 
discrete  variety.  References  contained  in  Section  5  of  Appendix  C5  provide 
additional  information  about  discrete  speech  recognition. 

5.  Recognition  Accuracy 

Training  plays  perhaps  the  most  significant  role  in  recognition  accuracy. 
Problems  often  arise  as  a  result  of  changes,  either  with  the  user  or  within  the 
environment.  A  computer  usually  is  much  more  sensitive  to  these  changes  than  is 
the  human.  An  impartial  observer  trained  to  detect  subtle  changes  and  who 
understands  the  mechanics  of  the  system  may  be  needed  for  troubleshooting  and
system  repair.  For  speaker  dependent  systems,  a  simple  retraining  session  may
restore  accuracy.  The  use  of  vocabulary  nodes  or  subsets  can  increase  both  speed
and  accuracy  (see  the  Vocabulary  Factors  Section).  Duplicate  words  that  result  in
the  same  output  string  may  minimize  rejection  problems.  Increasing  the  word
recognition  threshold  may  cause  a  higher  rejection  rate  but  can  minimize
misrecognition.
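
The three tuning techniques named in this paragraph (vocabulary nodes, duplicate words, and the recognition threshold) can be combined in one small Python sketch. The node names, template names, output strings, and default threshold are hypothetical; the sketch assumes the recognizer supplies a score between 0 and 1 for each trained template.

```python
from typing import Dict, Optional

# Each node lists only the templates that are active in that context.
# Duplicate templates (e.g. "delete" and "delete_2") are trained separately
# but produce the same output string.
VOCABULARY_NODES: Dict[str, Dict[str, str]] = {
    "main": {"open_file": "OPEN", "open_file_2": "OPEN", "quit": "QUIT"},
    "edit": {"delete": "DEL", "delete_2": "DEL", "save": "SAVE"},
}


def recognize(node: str,
              template_scores: Dict[str, float],
              threshold: float = 0.8) -> Optional[str]:
    """Return the output string of the best active template, or None if rejected."""
    active = VOCABULARY_NODES[node]
    # Templates outside the current node are never considered, which
    # shrinks the search and removes sound-alike words from other contexts.
    candidates = {name: score for name, score in template_scores.items()
                  if name in active}
    if not candidates:
        return None
    best = max(candidates, key=lambda name: candidates[name])
    # A higher threshold rejects more often but misrecognizes less often.
    if candidates[best] < threshold:
        return None
    return active[best]
```

In this sketch a poor match against "open_file" can still yield OPEN if the duplicate template "open_file_2" matches well, and raising the threshold argument turns borderline acceptances into rejections.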

Most  systems  come  from  the  manufacturer  adjusted  to  an  optimal  level;
making  changes  may  only  decrease  performance.  The  operations  manual  gives  the
best  guidance  on  how  manipulating  the  recognition  parameters  can
improve  or  detract  from  recognition.  Publications  listed  in  Section  6  of  Appendix
C5  provide  additional  information  on  recognition  accuracy. 

G.  HOST  COMPUTER  FACTORS 

Voice  recognition  systems  have  been  used  successfully  on  all  types  and  sizes  of 
computers.  Appendix  E  lists  current  voice  recognition  systems  and  describes  the 
host  computers  that  each  is  compatible  with.  Voice  recognition  has  also  been  used 
in  aircraft  and  spacecraft  control;  telephones;  robot  control;  in  teaching  people  how 
to  speak;  and  by  the  handicapped  to  control  body  limbs,  home  appliances, 
wheelchairs,  and  other  conveyances. 

As  voice  recognition  systems  mature  they  will  become  smaller,  cheaper,  have 
larger  vocabularies,  and  be  more  robust.  As  a  result  of  this  they  are  expected  to 
find  their  way  into  more  computer  applications  and  be  involved  in  more  aspects  of 
human  endeavor.  Section  1  of  Appendix  C6  provides  a  complete  list  of  references 
concerning  host  computer  applications  for  voice  recognition. 

1.  Microcomputers

Voice  recognition  systems  can  provide  input  to  microcomputers  via  many 
different  configurations,  both  internal  and  external.  External  "voice  boxes"  are
perhaps  the  easiest  to  install  and  maintain.  They  are  self-contained  units  that  may
have  an  interchangeable  storage  medium  device  that  allows  for  swapping  or
installing  vocabularies  or  software.  These  storage  devices  can  take  the  form  of
floppy  disks,  tape  cartridges,  integrated  circuit  chip  cartridges,  compact  optical
disks,  and  other  types  of  magnetic  and  optical  storage  devices.

A  replacement  keyboard  is  one  simple  and  inexpensive  way  to  install  a 
voice  recognition  system.  These  systems  require  no  additional  space  or  alterations 
to  the  microcomputer;  they  draw  their  power  from  the  normal  keyboard
connection,  and  have  ports  for  the  voice  recognition  microphone  and  related 
switches  built  into  the  keyboard.  Much  of  the  unique  voice  recognition  circuitry 
that  usually  is  installed  on  an  internal  microcomputer  board  is  in  the  keyboard.  The 
disk  storage  device  of  the  computer  is  used  for  its  vocabulary  and  other  software. 
Programming  this  type  of  system  is  easy  as  it  mimics  the  normal  keyboard 
keystroke  inputs.  Other  software  is  unaffected  by  the  system  and  is  unaware  that 
the  user  is  entering  commands  via  voice  rather  than  by  manual  keystrokes. 
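
The keystroke-mimicking behavior described above can be sketched as a simple lookup from recognized words to output strings. The vocabulary, the key codes, and the file name below are hypothetical examples; a real replacement keyboard would deliver the resulting characters through the keyboard connection rather than return them from a function.

```python
from typing import Dict, List

# Hypothetical vocabulary: recognized word -> keystrokes sent to the host.
KEY_MACROS: Dict[str, str] = {
    "save": "\x13",                        # assumed to be Ctrl-S in the host program
    "new line": "\r",
    "print report": "PRINT REPORT.TXT\r",  # one utterance expands to a full command
}


def to_keystrokes(recognized_words: List[str],
                  macros: Dict[str, str] = KEY_MACROS) -> str:
    """Concatenate the keystroke strings for the recognized words.

    Words that are not in the vocabulary are skipped, so the host
    software only ever sees ordinary, well-formed keyboard input.
    """
    return "".join(macros[word] for word in recognized_words if word in macros)
```

Because the output is an ordinary character stream, the application receiving it needs no modification, which is the property the text attributes to this type of installation.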

Another  implementation  is  through  the  use  of  an  internal  plug-in  circuit 
card.  This  card  operates  in  a  manner  similar  to  that  of  the  keyboard,  with  the 
microphone  and  switches  plugging  into  the  card.  These  cards  may  incorporate 
other  functions  such  as  a  modem  or  speech  synthesis  unit. 

Some  voice  systems  are  actually  incorporated  into  the  basic  design  of  the 
microcomputer  and  are  internal  and  omnipresent  to  its  operation.  Specific 
information  on  these  and  other  microcomputer  voice  systems  is  referenced  in
Section  2  of  Appendix  C6. 

2.  Mainframes 

Mainframe  computers  may  be  accessed  by  the  same  types  of  methods  as 
those  noted  for  microcomputers.  Links  from  microcomputers  used  either  as  dumb 
or  intelligent  terminals  also  may  be  used  for  access. 

Because  of  the  powerful  processors  and  large,  fast-access  storage  devices 
associated  with  mainframe  computers,  much  research  has  been  done  with  voice 
recognition  related  to  large  computers.  Research  literature  concerning  mainframe 
computers  and  other  large  computer  applications  of  voice  recognition  systems  is 
listed  in  Section  3  of  Appendix  C6. 

3.  Networks 

Computer  networks  and  voice  recognition  systems  come  as  a  natural 
extension  of  microcomputer  and  mainframe  application  of  voice  recognition. 
Separate  vocabulary  nodes  or  specialized  vocabularies  may  be  used  when  accessing 
different  networks.  Passwords  and  entry  procedures  can  be  incorporated  into  the 
output  strings,  removing  much  of  the  drudgery  related  to  moving  through  a 
network.  The  implementation  of  speech  recognition  also  allows  the  use  of  voice 
verification  as  an  automatic  entry  and  access  device. 

Two  of  the  largest  networks  used  today  are  the  telephone  network  and  the 
automatic  teller  machine  networks.  Voice  recognition  systems  have  been  proposed 
for  these  networks,  and  development  efforts  are  underway.  References  related  to 
voice  recognition  and  networks  are  contained  in  Section  4  of  Appendix  C6. 


4.  Types  of  Entry  Required 

Data  entry  requirements  vary  from  application  to  application.  Voice 
input  can  be  used  to  collect  data,  as  in  inventory  control  or  quality  control  and 
assurance  situations.  Voice  input  can  be  used  to  input  data  or  information  into  a 
computer,  such  as  in  order  processing,  or  to  manipulate  data,  as  in  automatic 
message  preparation.  Voice  can  be  used  to  convert  speech  to  text,  as  in  the 
"talkwriter"  or  automatic  dictation  machines.  Voice  can  verify  data  that  has  been 
entered  by  others  or  that  has  been  mechanically  or  automatically  entered  via  some 
other  input  device.  Voice  can  be  used  to  control  industrial  processes,  machines, 
and  robots. 

Each  of  these  applications  requires  a  different  type  of  system  to  make  it 
work  optimally.  References  related  to  data  entry  systems  are  provided  in  Section  5 
of  Appendix  C6. 

H.  EXPERIMENTS  AND  RESEARCH 

A  vast  amount  of  research  has  been  conducted  in  both  broad  and  specific  areas 
of  voice  recognition.  Section  1  of  Appendix  C7  contains  references  to  this 
research.  This  research  is  further  divided  into  logical  groupings,  to  allow  focused 
study.  Section  2  of  this  Appendix  covers  research  in  the  area  of  artificial 
intelligence.  Section  3  looks  at  future  research,  that  is,  those  areas  in  which  new 
trends  are  developing  or  towards  which  research  is  predicted  to  move.  Section  4 
deals  with  present  research,  covering  work  done  in  the  last  five  years.  Section  5 
includes  literature  related  to  research  conducted  prior  to  1  January  1983.  Many 
experiments  and  case  studies  have  been  conducted.  Section  6  is  devoted  to  these. 

A  special  area  of  interest  has  evolved  relating  the  field  of  voice  recognition  to 
the  area  of  natural  language  interfaces.  Dobney  states  that  natural  language 
interfaces  and  speech  recognition  are  fifth  generation  concepts.  A  natural  language
interface  allows  a  user  to  express  his  or  her  request  in  English.  Certain  difficulties 
arise  when  using  naturally  spoken  English.  The  problem  is  related  to  the  use  of 
homonyms,  such  as  "I  heard  the  song"  and  "I  saw  a  herd  of  buffalo".  A  related 
difficulty  results  when  phrases  sound  similar,  such  as  "I  scream"  and  "ice  cream". 
[Dobney  87]  The  human  mind  has  developed  ways  to  sort  out  these  problems; 
humans  understand  the  context  of  what  is  being  said,  and  are  sensitive  to  shifts  in 
context.  Dobney  presents  some  interface  complexities  which  natural  language 
processing  must  address  and  resolve.  Some  of  these  are  listed  here  to  demonstrate
the  scope  of  this  problem.

•  Time  flies  like  an  arrow.
   Fruit  flies  like  a  banana.

•  You  wouldn't  recognize  Mary  now.  She's  grown  another  foot.

•  Can  anyone  walk  over  Niagara  Falls  on  a  tightrope?

•  A  sandwich  is  better  than  nothing.
   Nothing  is  better  than  a  good  square  meal.
   Therefore  a  sandwich  is  better  than  a  good  square  meal.  [Dobney  87]

The  challenge  will  be  to  develop  machines  that  will  do  what  we  mean,  and  not 
necessarily  what  we  say.  Literature  documenting  research  dealing  with  natural 
language  interfaces  is  found  in  Section  7  of  Appendix  C7. 

III.  RESULTS  AND  CONCLUSIONS


A.  RESULTS 

The  primary  objective  of  this  thesis  is  to  provide  a  single  source  of  reference  to 
enable  the  selection  of  an  appropriate  voice  recognition  system  implementation  for 
a  given  DSS  or  other  computer  application.  Chapter  II,  Data  Analysis,  fulfills  this
objective  by  providing  both  a  broad  overview  of  voice  recognition  systems  and 
their  characteristics  and  a  close-up  view  of  specific  categories  within  voice 
recognition. 

The  second  objective  is  to  provide  a  reference  guide  to  current  voice 
recognition  literature  and  research.  Appendix  C  is  such  a  guide.  It  contains  an 
annotated  bibliography  and  has  subappendices  that  directly  link  this  bibliography  to
specific  areas  of  research  that  are  discussed  in  Chapter  II.  An  additional  result  of 
this  study  is  Appendix  D,  a  complete  index  of  all  publishers  mentioned  in  the 
bibliography,  which  should  facilitate  retrieval  of  articles  that  might  be  difficult  to 
locate. 

The  third  objective  is  to  provide  a  current  listing  of  all  commercially  available 
voice  recognition  systems.  This  listing  is  contained  in  Appendix  E,  and  gives  each 
manufacturer's  name,  address  and  phone  number.  The  various  types  of  voice  input 
devices  manufactured,  their  intended  use,  and  their  compatibility  with  current 
computer  systems  also  are  provided  there. 

The  overall  goal  of  this  study  is  to  provide  a  useful  guide  to  help  in  the  decision 
making  process  concerning  the  implementation  or  the  use  of  voice  recognition 
systems.  Information  in  this  study  can  be  used  both  as  an  introduction  to  voice 
recognition  systems  and  as  a  reference  source  to  answer  questions  on  specific
topics.  The  direct  linking  of  specific  topics  to  a  grouping  of  articles  dealing  with 
this  topic  allows  use  of  this  study  as  a  ready  reference  source. 

B.  CONCLUSIONS 

As  discussed  in  Chapter  I,  the  dialog  component  of  decision  support  systems 
may  be  the  weak  link  when  implementing  a  DSS.  By  using  voice  recognition 
systems  to  optimize  this  dialog  component,  the  overall  DSS  will  benefit. 

As  noted  in  the  Voice  Recognition  Systems  Section  of  Chapter  I,  voice 
recognition,  as  well  as  other  fifth  generation  concepts,  is  expected  to  be  critical  for
the  future  of  most  computer  applications. 

Research  listed  in  the  Human  Factors  Section  of  Chapter  II  has  shown  that 
stress  may  result  from  a  fear  of  new  technology.  Fear  of  new  technology  is  not  a 
recent  phenomenon.  This  fear  of  voice  recognition  systems  often  is  a  result  of  the 
user  not  being  previously  introduced  to  such  systems.  Fear  also  can  result  when  the 
user  is  unaware  of  what  voice  recognition  can  actually  do  (and  cannot  do). 

Considering  the  importance  of  voice  recognition  and  its  proven  value  to  human 
productivity,  the  volume  of  recent  research  is  not  increasing  proportionally  to  its 
perceived  importance.  This  is  indicated  by  the  amount  of  literature  referenced 
throughout  Chapter  II.  The  volume  of  publications  has  not  increased  in  recent
years  at  the  rate  of  studies  done  in  earlier  years. 

IV.  RECOMMENDATIONS


It  is  recommended  that  designers  and  users  of  DSSs  investigate  voice 
recognition  systems  as  a  means  of  optimizing  the  dialog  component  of  DSSs.  As 
noted  by  Ralph  Sprague,  describing  the  future  of  Decision  Support  Systems, 
"Dialog  will  profit  significantly  from  the  inclusion  of  natural  language  processing 
techniques  and  voice  recognition"  [Sprague  87]. 

As  the  reality  of  fifth  generation  computer  technology  approaches,  the  use  of 
"intelligent  systems"  such  as  natural  language  processing  and  voice  recognition 
systems  will  allow  for  both  flexible  and  natural  input.  Although  no  one  input 
method  is  perfect  or  even  appropriate  for  all  uses,  voice  systems  show  promise  for
wider  applications  than  are  presently  being  implemented.

Widespread  acceptance  of  computer  voice  recognition  can  be  encouraged  by 
proper  training  and  orientation  of  potential  users  of  such  systems.  A  good  training 
and  education  program  in  the  use  and  benefits  of  voice  recognition  will  help 
smooth  the  path  for  voice  recognition  implementation. 

More  research  is  needed  in  all  areas  of  voice  recognition.  Only  through 
continued  research  and  experimentation  can  voice  recognition  systems  develop  and 
improve.  The  perceived  recent  lull  in  voice  recognition  research  may  in  part  be 
due  to  normal  delays  in  the  publishing  process  or  to  recent  cutbacks  of  research 
funds.  However,  since  the  demand  for  better  input  methods  continues,  research 
must  also  continue. 

It  is  hoped  that  this  study  can  help  guide  and  inspire  the  use  of  voice 
recognition  systems  for  decision  support  systems  and  other  computer 
implementations.  A  tool  has  been  provided  that  can  enable  quick  reference  to
literature  related  to  specific  areas  of  concern  and  research  within  the  domain  of
computer  voice  recognition.  Continued  education  and  enlightenment  should  result
in  progress  and  greater  acceptance  of  these  systems.

APPENDIX  A  GLOSSARY  OF  TERMS


Group  Decision  Support  System  (GDSS):  a  computer-based  system  that  aims  at
supporting  collective  problem  solving.  A  collective  decision-making  process  can
be  viewed  as  a  problem  solving  situation  in  which  there  are  two  or  more  persons, 
(1)  each  of  whom  is  characterized  by  his  or  her  own  perceptions,  attitudes, 
motivations,  and  personality,  (2)  who  recognize  the  existence  of  a  common 
problem,  and  (3)  who  attempt  to  reach  a  collective  decision.  [Bui  86] 

Decision  Support  System  (DSS):  the  application  of  available  and  suitable
computer-based  technology  to  help  improve  the  effectiveness  of  managed  decision 
making  in  semi-structured  tasks.  [Keen  78] 

Voice  Recognition  (VR):  the  ability  of  a  computer  or  device  to  recognize
spoken  words  correctly  and  translate  those  sounds  into  a  predetermined  output
string  to  a  computer;  also  referred  to  as  automatic  speech  recognition  (ASR).
[LeFever  87]

Continuous  Speech  Recognition:  the  process  of  extracting  information  from 
strings  of  words  even  though  the  words  run  together  as  in  natural  speech.
[Yellen  83]

Discrete  (Isolated)  Speech  Recognition:  the  process  of  transforming  discrete 
utterances  (those  with  a  significant  pause  between  utterances)  into  computer- 
recognized  speech  or  text. 

Utterance  (Word):  may  be  a  single  mono-  or  polysyllabic  word  (e.g.,  select)  or
a  combination  of  mono-  or  polysyllabic  words  joined  into  a  phrase  (e.g.,  select-the-
first-choice).

Rejection:  the  inability  of  a  recognizer  to  classify  an  utterance  correctly.
[Yellen  83]

Misrecognition:  classification  by  a  recognizer  of  an  utterance  as  something 
other  than  what  was  spoken. 

Speaker  Dependent  Systems:  require  adaptation  (or  "training")  of  the  voice 
recognition  system  to  the  speech  characteristics  of  each  user  in  order  to  achieve
recognition. 

Speaker  Independent  Systems:  recognize  speech  regardless  of  the  speaker,  and 
without  system  training  in  recognition  of  individual  speech  characteristics  of  users. 
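
The  glossary  distinction  between  rejection  and  misrecognition  can  be  made
concrete  with  a  small  tally.  The  following  is  a  minimal  Python  sketch  and  is  not
taken  from  the  literature  surveyed  in  this  study.  The  function  name  score_trials,
the  use  of  None  to  stand  for  a  rejected  utterance,  and  the  sample  trials  are  all
assumptions  made  for  illustration.

```python
from collections import Counter


def score_trials(trials):
    """Tally recognizer outcomes for (spoken, recognized) pairs.

    `recognized` is None when the recognizer rejects the utterance.
    This convention is an assumption for illustration only.
    """
    counts = Counter()
    for spoken, recognized in trials:
        if recognized is None:
            counts["rejection"] += 1        # no classification was produced
        elif recognized == spoken:
            counts["correct"] += 1          # output matches what was said
        else:
            counts["misrecognition"] += 1   # classified as something else
    total = sum(counts.values())
    # Convert counts to rates; an empty trial list yields all-zero rates.
    return {
        key: (counts[key] / total if total else 0.0)
        for key in ("correct", "rejection", "misrecognition")
    }


# Hypothetical trials built from the utterances used in the glossary.
trials = [
    ("select", "select"),
    ("select-the-first-choice", None),
    ("select", "reject"),
    ("select", "select"),
]
rates = score_trials(trials)
```

Keeping  the  two  error  types  separate  matters  because  a  rejection  can  be  handled
by  asking  the  speaker  to  repeat  the  utterance,  whereas  a  misrecognition  passes  a
wrong  command  to  the  system  unnoticed.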

Automatic  Speech  Recognition  (ASR)
Automatic  Word  Recognition  (AWR)
Continuous  Recognizer
Decision  Support  System  (DSS)

problems,  and  leads  to  user  satisfaction.

It  is  demonstrated  in  this  paper  that  a  real-time,  large-vocabulary,  isolated-
word  speech  recognition  system  can  effectively  be  implemented  using  the
The  above  system  has  been  implemented  in  a  minicomputer  environment.

This  thesis  investigates  speech  recognition  in  a  command  and  control
workstation  environment.  It  discusses  the  Navy's  need  for  a  command  and
control  workstation  (CCWS)  and  the  importance  of  the  human  interface

n.  4,  pp.  309-316,  1  December  1987.

The  present  study  uses  a  range  of  speech  intelligibility  measures  to  examine

Computer  World,  v.  18,  pp.  73-74,  6  February  1984.

Focuses  on  speech  systems,  and  predicts  that  they  are  likely  to  become  the 
major  new  I/O  device  of  microcomputing  in  the  middle  to  late  1980s. 

for  general  office  use;  in  the  meantime,  voice  systems,  with  all  their
problems,  are  being  used  in  a  variety  of  applications  now.

extended  to  the  recognition  of  connected  speech.  It  is  concluded  that,
although  current  automatic  speech  recognition  algorithms  are  still  relatively

A  high  performance,  flexible,  and  potentially  inexpensive  speech
recognition  system  is  described  in  this  report.  The  system  is  based  on  two 
special-purpose  integrated  circuits  that  perform  the  speech  recognition 
algorithms  very  efficiently.  One  of  these  integrated  circuits  is  the  front-end 
processor.  It  computes  spectral  coefficients  from  incoming  speech, 
normalizes  these  spectra,  and  finds  the  start  and  end  of  words  in  the  speech.
It  transmits  these  spectra  to  a  second  integrated  circuit  that  compares  them
with  spectra  from  a  set  of  stored  word  templates.  The  system  can  compare 
an  input  word  with  one  thousand  word  templates  and  respond  to  a  user 
within  one  quarter  of  a  second.  The  system  normally  responds  to  words 
spoken  in  isolation  from  a  particular  speaker;  however,  it  can  be  used  with
connected  speech  as  well  as  in  a  speaker  independent  manner.  Modifying 
speech  recognition  algorithms  to  work  with  specially  designed  integrated 
circuits  is  shown  to  permit  even  high  performance  algorithms  to  be 
performed  inexpensively.  Using  techniques  such  as  these,  speech  recognition
devices  should  have  a  large  range  of  applications  within  the  next  few  years. 

[Neil  81]  National  Technical  Information  Service  AD  A103280,  NPS-55-81-003,
Examination  of  Voice  Recognition  System  to  Function  in  a

[Niemann  84]  Niemann,  H.,  and  others,  "The  Speech  Understanding  and

Local  spectral  distortion  measures  are  commonly  used  to  measure  the
similarity  (or  spectral  distance)  between  two  given  short-time  spectra.  In
Itakura-Saito  (IS)  distortion  measure,  the  log  likelihood  ratio  (LLR)
4-talker,  telephone  recording  data  base.  The  results  can  be  summarized  as:
(1)  All  LPC-based  distortion  measures  performed  reasonably  well.  The
LLR  and  WSM  distortion  measures  gave  the  highest  recognition  accuracy, 
while  the  IS  distortion  measure  gave  the  lowest  score;  (2)  Whereas  the 
addition  of  suprasegmental  energy  information  helped  the  recognition 
performance,  the  use  of  gain  and  absolute  loudness  degraded  the 
performance;  (3)  Bark-scale  frequency  warping  did  not,  at  least  for  the 
unwarped  counterpart;  (4)  The  WLR  distortion  measure  did  not  perform  as
well  as  its  unweighted  counterpart.

from  the  Computer  Database),  p.  52,  July  1986.

Provides  a  bibliography  that  contains  citations  concerning  the  principles,
designs,  development,  and  various  applications  of  computerized  speech
synthesis  and  speech  recognition.

1983.  ISBN  0-08-029357-3.

[Paddock  83]  Paddock,  Harold  E.,  "Voice  Input:  A  Reality",  The  Internal

This  paper  discusses  the  factors  known  to  influence  the  performance  of
automatic  speech  recognizers  and  describes  test  procedures  for 
characterizing  their  performance.  It  is  directed  toward  all  the  stakeholders 
in  the  speech  community  (researchers,  vendors,  and  users);  consequently, 
the  discussion  of  test  procedures  is  not  directed  toward  the  needs  of  specific 
users  to  demonstrate  the  performance  characteristics  of  any  specific
algorithmic  approach  or  particular  product.  It  relies  significantly  on 

any  segmentation  operation  or  the  speech/silence  distinction)  is  avoided:
usually  this  is  a  significant  source  of  errors.  Instead,  decisions  are  deferred

between  similar  words,  their  speaker  adaptation  capability,  and  the  ease  with

tested.  In  the  third  experiment  a  complete  1000-word  recognition  system
was  developed  which  performed  the  segmentation,  the  classification  of
consonant  clusters  and  vowels,  and  a  correction  of  recognition  errors  by  use
of  a  phonetic  lexicon.  Demisyllable  segmentation  and  processing  have
proved  suitable,  especially  for  large  vocabularies.

More  on  Voice  Recognition",  Byte,  pp.  147-154,  February  1984.

Recent  developments  raise  some  questions  about  perceived  industry  trends. 

[Shapiro  85]  Shapiro,  S.  F.,  "Speech  Recognition  Produces  Natural
Interface",  Computer  Design,  pp.  59-62,  March  1985.

[Shore  83]  Shore,  J.  E.  Burton,  "Discrete  Utterance  Speech  Recognition
Without  Time  Alignment",  IEEE  Transactions  on  Information  Theory,  pp.
472-491,  July  1983.

[Silverman  85]  Silverman,  H.  F.,  "One  Architectural  Approach  for  Speech 
Recognition  Processors",  Algorithmically  Specialized  Parallel  Computers,
pp.  129-148,  Academic  Press,  Inc.,  1985.  ISBN  0-12-654130-2.

[Siroux  85]  Siroux,  J.,  and  Gillet,  D.,  "A  System  for  Man-Machine
Communication  Using  Speech",  Speech  Communication,  v.  4,  pp.  289-315,
December  1985.

KEAL  is  a  continuous  speech  recognition  system  developed  at  the  CNET 
laboratory  in  Lannion  (France).  Part  of  the  laboratory’s  current  work  aims 
at  extending  it  in  the  direction  of  a  speech-understanding  and  man-machine 
dialog  system.  A  question-answer-type  dialog  is  set  in  motion  in  order  to 
provide  the  user  with  information  (the  current  application  consists  in 
simulating  a  directory  inquiries  service).  This  paper  describes  how 
syntactic,  semantic,  and  pragmatic  knowledge  is  used  for  implementing  such 
a  dialog,  and  the  main  advantages  and  drawbacks  of  the  methods  chosen  are 
discussed.  Sentence  recognition  is  performed  by  a  left-to-right  bottom-up 
parser  by  means  of  a  semantic  context-free  grammar.  Using  a  method 
analogous  to  that  of  semantic  attributes,  the  parse-tree  is  then  interpreted  in 
order  to  obtain  a  semantic  structure  which  represents  the  information 
relevant  to  the  subsequent  dialog.  The  dialog  manager  uses  the  semantic 
structure  for  instantiating  a  model  graph,  which  represents  the  state  of  the
dialog  at  any  instant;  it  indicates  the  next  message  to  be  sent  to  the  user,  and
how  to  analyze  his  answer.  An  example  derived  from  the  directory 
inquiries  service  is  described. 

[Smith  83]  Smith,  F.  J.,  and  Linggard,  R.  J.,  "Information  Retrieval  by
Voice  Input  and  Output",  Research  and  Development  in  Information
Retrieval,  pp.  275-288,  Springer-Verlag  New  York,  Inc.,  1983.
ISBN  0-387-11978-7.

[Smith  84]  Smith,  Emily  T.,  and  Harris,  Marilyn  A.,  "More  Than  a
Whisper  of  Hope  for  Computers  You  Can  Talk  To",  Business  Week,  p.  92F-H,
17  December  1984.

Examines  the  new  IBM  experimental  computer  which  has  a  system  capable 
of  recognizing  5,000  spoken  words  with  95%  accuracy. 

[Spine  84]  Spine,  T.,  Williges,  B.  H.,  and  Maynard,  J.  F.,  "An  Economical
Approach  to  Modeling  Speech  Recognition  Accuracy",  International
Journal  of  Man-Machine  Studies,  v.  21,  pp.  191-202,  September  1984.

Accuracy  of  speech  recognizer  decisions  is  an  important  criterion  for 
maintaining  both  system  effectiveness  and  user  satisfaction.  A  central- 
composite  design  methodology  is  recommended  as  an  economical  means  to 
develop  empirical  prediction  equations  for  speech  recognizer  performance 
incorporating  a  number  of  influential  factors.  Factors  manipulated  in  the 
central-composite  design  included  number  of  training  passes,  reject 
threshold,  difference  score,  and  size  of  the  active  vocabulary.  The  factorial 
combination  of  two  noncontinuous  variables,  sex  of  the  speaker  and  inter-word
confusability,  was  also  investigated  by  replicating  the  central-
composite  design  to  create  four  sets  of  data.  Standard  least-squares  multiple
regression  analysis  was  used  to  develop  the  four  sets  of  prediction  equations, 
each  of  which  accounted  for  at  least  50%  of  the  variance  in  recognizer 
performance.  A  cross-validation  study  revealed  that  shrinkage  was  not 
excessive.  Subsequently,  these  empirical  models  were  incorporated  into  an
interactive  design  tool  for  a  dialogue  author  where  the  percentage  of  correct 
recognition  is  automatically  optimized  when  the  dialogue  author  enters  the 
size  of  the  vocabulary  to  be  used  or  both  the  vocabulary  size  and  desired 
number  of  training  passes.  The  design  tool  can  also  be  used  to  make 
predictions  anywhere  within  the  response  surface.  Use  of  these  efficient 
data  collection  procedures  along  with  the  interactive  design  tool  should 
greatly  assist  the  dialogue  author  in  predicting  the  impact  of  various 
language,  task,  environmental,  algorithmic,  human,  and  performance 
evaluation  factors  on  speech  recognition  accuracy. 

[Stephens  83]  Stephens,  Ron,  "Make  the  Way  for  Another  Revolution", 
Modern  Offices,  v.  28,  pp.  96+,  October  1983. 

Suggests  that  many  of  the  current  methods  of  communicating  and 
manipulating  information  which  have  traditionally  been  dependent  on 
keyboard  entry,  may  soon  be  replaced  by  voice-based  procedures,  causing  a 
major  transformation  with  the  automated  office. 

[Strat  Inc  81]  Voice  Input/Output:  Markets,  Technologies  Applications,
p.  110,  Strategic  Inc.,  1981. 

Analyzes  the  advantages  of  voice  I/O,  states  of  the  market  technology  trends 
in  speech  synthesis,  future  applications,  voice  response,  text-to-voice, 
language  translations,  aids  to  handicapped  and  computer  output.  Electronic 
voice  mail,  dictation/word  processing,  computer  I/O  automation,  games, 
etc.,  also  are  included. 

[Sweeney  86]  Sweeney,  M.  J.,  and  Bitar,  K.  J.,  An  Analysis  of  Friendly
Input  Devices  for  the  Control  of  the  Naval  Warfare  Interactive  Simulation 
System,  Master's  Thesis,  Naval  Postgraduate  School,  Monterey,  California, 
March  1986.  AD  S9333. 

This  thesis  describes  an  experiment  conducted  at  the  Naval  Postgraduate 
School  (NPS)  during  the  period  15  October  through  28  October  1985.
Specifically,  the  experiment  evaluates  "pull-down  window"  micro-computer 
technology,  continuous  speech  recognition  equipment,  and  standard 
computer  keyboard  entry  to  input  commands  and  control  environment. 
Using  the  Naval  Warfare  Interactive  Simulation  System  (NWISS)  as  a 
controlled  medium,  military  problems  were  posed  to  test  subjects  in  specific 
light  and  noise  environments.  Although  the  results  are  not  entirely 
conclusive,  they  do  demonstrate  a  distinct  advantage  in  using  continuous 
speech  or  keyboard  entry  modes  over  the  drop-down  window  technology  of 
the  Macintosh  (if  subject  training  time  is  not  a  significant  restriction). 
Either  the  continuous  speech  or  the  keyboard  method  was  clearly  superior  in 
all  environments. 

[Taggart  81]  National  Technical  Information  Service  AD-A105  568,  Voice
Recognition  as  an  Input  Modality  for  the  TACCO  Preflight  Data  Insertion
Task  in  the  P-3C  Aircraft,  by  John  Laughlin  Taggart  and  Charles  Darwin
Wolfe,  Jr.,  p.  150,  March  1981.

Reports  the  results  of  an  experiment  to  compare  accuracy  and  entry  speed 
capabilities  of  a  standard  keyboard  with  the  Threshold  Technology  T-600 
voice  recognition  unit  in  the  performance  of  an  operational  data  entry  task 
in  the  P-3C  aircraft. 

[Tanaka  83]  Tanaka,  A.,  and  others,  "A  Study  of  the  Syllable  Oriented
Recognition  of  Continuous  Speech",  Speech  Communication,  v.  2,  n.  2-3,  pp.
207-210,  July  1983.

[Taylor  86]  Taylor,  M.,  Voice  Input  Applications  in  Aerospace,  pp.  322-337,
McGraw-Hill  Inc.,  1986.  ISBN  0-07-007913-7.

[Tecosky  86]  Tecosky,  T.,  Interfacing  Standards  for  Recognizers,  pp.  244-255,
McGraw-Hill  Inc.,  1986.  ISBN  0-07-007913-7.

[Teja  83]  Teja,  E.  R.,  and  Gonnella,  G.,  Voice  Recognition  Technology,
p.  212,  Reston  Publishing  Co.,  1983.  ISBN  0835984176.

[Thompson  84]  Thompson,  H.,  "Artificial  Intelligence  and  Speech  Processing:
The  Good  News  and  the  Bad  News",  Proceedings  of  the  1st  International
Conference  of  Speech  Technology,  p.  217,  October  1984.

Discusses  author's  expectations  about  the  contributions  we  can  and  cannot 
expect  from  Artificial  Intelligence  to  Speech  Processing  over  the  next  few 
years. 

[Thompson  85]  Thompson,  Linde,  "Voice  Recognition  Systems:  A  Sound 
Investment  in  the  Future",  News  34-38,  pp.  59+,  March  1985. 

Looks  at  the  present  and  the  future  uses  of  voice  recognition. 

[Tyler  86]  Tyler,  J.,  "Speech  Recognition  System  Using  Walsh  Analysis
and  Dynamic  Programming",  Microcomputers  &  Microsystems,  pp.  427-433,
October  1986.

[Underwood  84]  Underwood,  M.  J.,  "Human  Factors  Aspects  of  Speech
Technology",  Proceedings  of  the  1st  International  Conference  of  Speech
Technology,  p.  223,  October  1984.

Regards  speech  technology  as  a  means  to  an  end,  and  not  an  end  in  itself. 
Discusses  the  human  component  in  the  speech  technology  system  and  its 
importance. 

[Viglione  84]  Viglione,  S.  S.,  "Trends  in  Development  of  Speech
Recognition  Systems",  Proceedings  of  the  1st  International  Conference  of
Speech  Technology,  p.  169,  October  1984.

Discusses  the  inherent  superiority  of  speech  over  other  modes  of  human 
communications  and  the  growing  need  for  better  control  of  complex 
machines.  Discusses  the  major  role  of  man-machine  communication 
through  the  use  of  speech  recognition  and  speech  response  systems. 

[Viglione  86]  Viglione,  S.,  Recognition  Past  and  Future,  pp.  373-387, 
McGraw-Hill  Inc.,  1986.  ISBN  0-07-007913-7. 

Discusses  the  inherent  superiority  of  speech  over  other  modes  of  human 
communication  and  the  growing  need  for  better  control  of  complex 
machines.  Discusses  the  major  role  of  man-machine  command  through  the
use  of  speech  recognition  and  speech  response  systems. 

[Visser  87]  Visser,  Roger,  "Voice  Recognition  Fills  Technical  Barriers",
Manufacture  Engineering,  v.  98,  pp.  CT-24  to  CT-26,  May  1987.

Discusses  voice  recognition,  the  technology  which  allows  people  to  interact
with  computers  using  voice  instead  of  keyboards  and  terminals  and  which 
has  been  successfully  implemented  by  numerous  manufacturers  from  steel 
and  car  makers  to  circuit  board  designers. 

[Wagner  87]  Wagner,  M.,  "A  Speech  Recognition  Experiment  With  the
Entire  Syllable  Inventory  of  Standard  Chinese",  Speech  Communication,  v.
6,  pp.  363-369,  1  December  1987.

This  paper  explores  the  possibility  of  using  automatic  speech  recognition  as 
a  front  end  to  a  computer  for  Chinese  character  processing.  A  speech 
recognition  experiment  has  been  performed  with  the  complete  inventory  of 
second-tone  syllables  of  Standard  Chinese.  Two  recordings  of  this 
inventory,  which  were  made  48  hours  after  one  another,  were  used  as  test 
and  reference  sets.  It  is  shown  that  the  distribution  of  intrasyllable  distances 
and  the  distribution  of  intersyllable  distances  overlap  considerably  for  the 
full  inventory  of  260  second-tone  syllables.  The  recognition  rate  was 
determined  as  a  function  of  the  syllable  size  and  is  47.3%  for  the  complete 
syllable  inventory. 

[Watrous  85]  Watrous,  Raymond,  "Speech  Input/Output:  Support  for
Integration",  Journal  of  Computer-Integrated  Manufacturing  Management,
v.  1,  pp.  37-44,  Spring  1985.

Describes  the  current  status  of  speech  I/O  technology  and  defines  some  of 
the  terminology  associated  with  the  technology  followed  by  a  discussion  of 
the  technology's  advantages  and  successful  use. 

[Wetterlind  86]  Wetterlind,  Peter  James,  "A  Speech  Error  Correction 
Algorithm  for  Natural  Language  Input  Processing",  Computer  Science,  v. 
17,  p.  300,  1986,  UMI  order  number:  AD  A86-25455.

This  research  experiment  consisted  of  construction  of  a  system  for 
identifying  a  natural  language  sentence  using  only  speaker  independent 
phonemes  as  the  input.  The  motivating  hypothesis  for  the  experiment  is  that 
spoken  sentences  can  be  recognized  from  limited  phoneme  input.  The 
research  system  accepts  only  strings  of  consonant  phonemes,  which  are 
recognizable  in  a  speaker  independent  environment.  The  original  'spoken'
sentence  is  reproduced  from  the  consonant  phonemes  and  formatted  as  a 
word  sequence  for  subsequent  transmission  to  a  natural  language  processing 
system.  The  system  uses  a  vocabulary  of  general  words  and  an  expandable 
dictionary  of  domain  specific  words  during  the  sentence  recognition 
process. 

[White  84]  White,  G.  M.,  "Speech  Recognition:  An  Idea  Whose  Time  is
Coming",  213-225,  January  1984.

Some  theoretical  and  practical  aspects  of  this  emerging  technology  are 
presented. 

[Wilson  84]  Wilson,  J.,  "Where  Do  We  Go  from  Here?",  Proceedings  of
the  1st  International  Conference  of  Speech  Technology,  p.  181,  October
1984.

Discusses  the  background  and  evolution  of  future  speech  technology 
products  and  services. 

[Williams  85]  Williams,  John  M.,  "Computer  Knows  its  Programmer's
Voice",  Government  Computer  News,  v.  4,  p.  32,  5  July  1985.

Discusses  a  quadriplegic's  voice  recognition  system  which  allows  him  to
perform  the  same  tasks  as  other  computer  programmers. 

[Withers  83]  Withers,  S.  J.,  "Voice  Control  of  an  Interactive  Simulation",
Simulation,  pp.  28-29,  January  1983.

A  low  cost,  microcomputer-based  voice  recognition  device  makes  a 
convenient  input  channel  for  an  interactive  model  of  a  manufacturing 
system.  The  problems  with  current  hardware  are  its  limited  capabilities  and
unreliable  operation.  However,  the  potential  exists  for  useful  voice  control 
of  simulations  in  the  near  future. 

[Wood  86]  Wood,  Lamont,  "Voices  in  the  Wilderness",  Computer
Decisions,  v.  18,  pp.  34+,  8  April  1986.

States  that  voice  recognition  is  a  long  way  from  becoming  a  widely  accepted 
office  technology  but,  nevertheless,  today's  voice  recognition  systems  do 
have  valuable  applications,  especially  on  the  shop  floor  and  in  the 
warehouse. 

[Woods  85]  Woods,  Tom,  "Computers  Learn  to  Listen",  Business
Computer  Systems,  v.  4,  pp.  80+,  March  1985.

Suggests  that  today's  pioneering  speech  recognition  products  provide  a
glimpse  of  the  exciting  technologies  and  diverse  business  applications  soon 
to  come. 

[Wyatt  85]  Wyatt,  Jim,  and  Elbon,  Dave,  "Computers  That  Listen  and
Talk",  Cause/Effect,  v.  8,  pp.  9+,  July  1985.

Points  out  that  when  considering  voice  input/output,  the  terms  voice  storage
and  playback,  voice  recognition,  and  voice  synthesis  can  be  used  to
characterize  tasks  being  performed,  and  explains.

[Yalabik  84]  Yalabik,  N.,  and  Unal,  F.,  "An  Efficient  Algorithm  for
Recognizing  Isolated  Turkish  Words",  New  Systems  and  Architectures  for
Automatic  Speech  Recognition  and  Synthesis,  pp.  419-426,  2-14  July  1984.

[Yannakoudakis  85]  Yannakoudakis,  E.  J.,  "Voice  I/O:  Problems  and
Perspectives",  Computer  Bulletin,  v.  1,  pp.  10-12,  September  1985.

Discusses  one  University's  approach  to  computer  voice  I/O  with  the  playback
or  recognition  of  speech  units  through  the  application  of  rules  in  an
algorithmic  manner.  4  references.

[Yellen  83]  Yellen,  H.  W.,  A  Preliminary  Analysis  of  Human  Factors
Affecting  the  Recognition  Accuracy  of  a  Discrete  Word  Recognizer  for  C3
System,  Master's  Thesis,  Naval  Postgraduate  School,  Monterey,  California,
March  1983.  AD  A128546.

Literature  pertaining  to  voice  recognition  abounds  with  information 
relevant  to  the  assessment  to  transitory  speech  recognition  devices.  In  the 
past,  engineering  requirements  have  dictated  the  path  this  technology 
followed.  But,  other  factors  do  exist  that  influence  recognition  accuracy. 
This  thesis  explores  the  impacts  of  human  factors  on  the  successful 
recognition  of  speech,  principally  addressing  the  differences  or  variability 
among  users.  A  Threshold  Technology  T-600  was  used  for  a  100  utterance
vocabulary  to  test  44  subjects.  A  statistical  analysis  was  conducted  on  five 
generic  categories  of  human  factors:  occupational,  operational, 
psychological,  physiological,  and  personal.  How  the  equipment  is  trained 
and  the  experience  level  of  the  speaker  were  found  to  be  key  characteristics 
influencing  recognition  accuracy.  To  a  lesser  extent  computer  experience, 
time  of  week,  accent,  vital  capacity  and  rate  of  air  flow,  speaker 
cooperativeness,  and  anxiety  were  found  to  affect  overall  error  rate. 

[Zue  83]  Zue,  V.  W.,  "The  Use  of  Phonetic  Rules  in  Automatic  Speech
Recognition",  Speech  Communication,  v.  2,  n.  2-3,  pp.  181-186,  July  1983.

[Zue  84]  Zue,  V.  W.,  and  Huttenlocher,  D.  P.,  "Computer  Recognition
of  Isolated  Words  from  Large  Vocabularies:  Lexical  Access  Using  Partial
Phonetic  Information",  Institute  of  Information  Science,  pp.  343-3A1,  1984

[Bruce  82]

[Ivall  86-2] 

[Peckham  83] 

[Cater  84] 

[Joost  83] 

[Peckman  86] 

[Cavazza  84] 

[Kurzweil  86] 

[Pfauth  83] 

[Cerf-Danon  87] 

[Lea  86] 

[Philip  87] 

[Clements  87]

[LeFever  87]

[Pierrel  87]

[Cochran  83] 

[Leggett  82] 

[Pister-Bourjot  87] 

[Cole  85] 

[Llaurado  82] 

[Pluhar  83] 

[Conrad  83] 

[Martin  84] 

[Poock  80] 

[Dabbagh  86] 

[Martin  86] 

[Poock  83-2] 

[EDP  Anal  83] 

[Mascarenas  84] 

[Poock  83-3] 

[Elenius  86] 

[Meloni  83] 

[Poock  83-4] 

[Elster  80] 

[Menke  87]

[Poock  83-6]

[Eskenazi  83]

[Mokhoff  84]

[Poock  83-7]

[Fallside  85]

[Moody  85] 

[Poock  84] 

[Fallside  86] 

[Myers  83] 

[Prasad  87] 

[Ford  83] 

[Neil  81] 

[Pursley  85]

[Foster  82] 

[Niemann  85] 

[Rehsoft  84] 

[Friedman  84] 

[NTIS  86-1]

[Rollins  85]

[Good  84]

[NTIS  86-2]

[Ross  84]

[GovDatSys  86]

[NTIS  86-3]

[Salfer  85]

[Green  83]

[NTIS  86-4]

[Santarelli  84]

[Green  85]

[NTIS  87-1]

[Schalk  83]

[Hager  86]

[O'Neil  82]

[Schmandt  85] 

[Hobbs  84] 

[Ogozalek  86] 

[Seaman  82] 

[Seaman  83]

[Sweeney  86]

[Visser  87]

[Seaman  85] 

[Taylor  86] 

[Wagner  87] 

[Senensieb  84] 

[Tecosky  86] 

[Watrous  85] 

[Shapiro  84] 

[Teja  83] 

[White  84] 

[Shapiro  85] 

[Thompson  84] 

[Wood  86] 

[Siroux  85] 

[Thompson  85] 

[Woods  85] 

[Smith  83] 

[Underwood  84] 

[Wyatt  85] 

[Smith  84] 

[Viglione  84] 

[Yalabik  84]

[Stephens  83]

[Viglione  86]

[Yellen  83]

SECTION  2.  MULTILINGUAL  FACTORS 

[Eskenazi  83] 

[Niemann  85] 

[Wagner  87] 

[Meloni  83] 

[Pister-Bourjot  87] 

[Yalabik  84] 

[Neil  81]

[Prasad  87] 

SECTION  3.  MULTICULTURAL  FACTORS 

[Eskenazi  83] 

[Ogozalek  86] 

[Salfer  85]

[Meloni  83] 

[Pister-Bourjot  87] 

[Wagner  87] 

[Neil  81] 

[Prasad  87] 

[Yalabik  84] 

[Niemann  85] 

SECTION  4.  COMMAND  AND  CONTROL  ENVIRONMENTS 

[Cerf-Danon  87] 

[Pfauth  83] 

[Poock  83-4] 

[Hobbs  84] 

[Pister-Bourjot  87] 

[Poock  83-6] 

[LeFever  87] 

[Pluhar  83] 

[Salfer  85]

[Neil  81]

[Poock  80] 

[Sweeney  86] 

[Niemann  85] 

[Poock  83-2] 

[Yellen  83]

SECTION  5.  HIGH  NOISE  ENVIRONMENTS 

[Elster  80] 

[Pluhar  83] 

[Rehsoft  84] 

[Martin  84] 

[Poock  83-3] 

[Rollins  85] 

[Pfauth  83] 

[Poock  84] 

SECTION  6.  LOW-LIGHT  ENVIRONMENTS 
[Salfer  85]

APPENDIX  C3  SITUATIONAL  FACTORS


SECTION  1.  SITUATIONAL  FACTORS

[Bakst  87] 

[GovDatSys  86] 

[Blunden  80] 

[Green  83] 

[Bristow  86-1] 

[Green  85] 

[Bristow  86-2] 

[Hager  86] 

[Brown  87] 

[Hill  86]

[Bruce  82] 

[Hunter  85] 

[Cater  84] 

[Int  Res  Dev  85] 

[Cavazza  84] 

[Int  Res  Dev  87] 

[Cerf-Danon  87] 

[Ivall  86-1] 

[Clements  87]

[Ivall  86-2] 

[Cochran  83] 

[Joost  83] 

[Cole  85] 

[Kohonen  85]

[Connolly  86] 

[Kurzweil  86] 

[Conrad  83] 

[Lea  86] 

[Dabbagh  86] 

[LeFever  87] 

[Damper  84] 

[Leggett  82] 

[Damper  85] 

[Llaurado  82] 

[EDP  Anal  83]

[Maenobu  84]

[Elenius  86]

[Martin  86] 

[Elster  80] 

[Mascarenas  84] 

[Eskenazi  83] 

[Menke  87]

[Fallside  85]

[Mokhoff  84]

[Fallside  86] 

[Moody  85] 

[Fisher  86] 

[Myers  83] 

[Ford  83] 

[Neil  81]

[Foster  82]

[NTIS  86-1]

[Friedman  84] 

[NTIS  86-2] 

[Good  84] 

[NTIS  86-3] 



[NTIS  86-4]

[NTIS  87]

[O’Neil  82]
[Paddock  83]
[Pallett  85]

[Pallett  86]
[Pearkins  84]
[Peckham  83]
[Peckman  86]
[Philip  87]

[Pierrel  87] 
[Pister-Bourjot  87] 
[Pluhar  83] 
[Poock  80]

[Poock  83-7] 
[Poock  84] 

[Prasad  87]
[Pursley  85] 
[Rehsoft  84] 
[Salfer  85]
[Santarelli  84]
[Schalk  83]
[Schmandt  85]
[Seaman  82] 
[Seaman  83] 
[Seaman  85] 
[Senensieb  84] 
[Shapiro  84] 
[Shapiro  85]
[Siroux  85] 
[Smith  83] 
[Stephens  83] 
[Taylor  86] 
[Tecosky  86] 
[Teja  83] 


[Thompson  84] 
[Thompson  85] 
[Underwood  84] 
[Viglione  84] 
[Viglione  86] 
[Visser  87] 
[Watrous  85] 


[White  84] 
[Williams  85] 
[Wood  86] 
[Woods  85] 
[Wyatt  85] 
[Yellen  83]


SECTION  2.  MULTIUSER  OR  GROUP  USAGE

[Cerf-Danon  87]
[Connolly  86]
[Eskenazi  83]
[Kohonen  85]
[LeFever  87]
[Maenobu  84]
[Neil  81]
[Pister-Bourjot  87]
[Pluhar  83]
[Poock  80]
[Prasad  87]
[Salfer  85]
[Yellen  83]

SECTION  3.  INDIVIDUAL  USAGE

[Hill  86]
[Pister-Bourjot  87]

SECTION  4.  HANDICAP  SITUATIONS

[Damper  84]
[Damper  85]
[Fisher  86]
[Kurzweil  86]

APPENDIX  C4  QUANTITATIVE  FACTORS


SECTION  1.  QUANTITATIVE  FACTORS 


[Anatharaman  86] 

[Gould  83] 

[Moody  85] 

[Baker  84] 

[GovDatSys  86] 

[Myers  83] 

[Bisiani  84] 

[Green  83] 

[NTIS  86-1]

[Blunden  80] 

[Green  85] 

[NTIS  86-2]

[Bristow  86-1] 

[Gubrynowicz  84] 

[NTIS  86-3]

[Bristow  86-2] 

[Hager  86] 

[NTIS  86-4] 

[Brown  87] 

[Harrison  84] 

[NTIS  87-1] 

[Bruce  82] 

[Hill  86] 

[O’Neil  82] 

[Calcaterra  82] 

[Hobbs  84] 

[Paddock  83] 

[Cater  84] 

[Hunter  85] 

[Pallett  85] 

[Cavazza  84] 

[Int  Res  Dev  85] 

[Pallett  86] 

[Clements  87]

[Int  Res  Dev  87]

[Pearkins  84]

[Cochran  83] 

[Ivall  86-1] 

[Peckham  83] 

[Cole  85] 

[Ivall  86-2] 

[Peckman  86]

[Conrad  83] 

[Johnson  86] 

[Pfauth  83]

[Dabbagh  86] 

[Joost  83] 

[Philip  87]

[Dillman  84] 

[Koelsch  87] 

[Pierrel  87] 

[EDP  Anal  83]

[Kurzweil  86] 

[Pluhar  83] 

[Elenius  86] 

[Lea  86] 

[Poock  83-1]

[Elster  80] 

[LeFever  87] 

[Poock  83-7] 

[Epstein  86] 

[Leggett  82] 

[Poock  84]

[Fallside  85] 

[Llaurado  82] 

[Poock  85] 

[Fallside  86] 

[Lombardo  84] 

[Pursley  85] 

[Ford  83] 

[Martin  84] 

[Reardon  87] 

[Foster  82] 

[Martin  86] 

[Rehsoft  84] 

[French  83] 

[Mascarenas  84]

[Saitta  83]

[Friedman  84] 

[Meisel  84] 

[Santarelli  84] 

[Good  84] 

[Mokhoff  84]

[Schalk  83]

[Schmandt  85]

[Smith  83] 

[Underwood  84] 

[Scott  83] 

[Smith  84] 

[Viglione  84] 

[Seaman  82] 

[Stephens  83]

[Viglione  86] 

[Seaman  83] 

[Sweeney  86] 

[Visser  87] 

[Seaman  85] 

[Taylor  86] 

[Watrous  85] 

[Senensieb  84] 

[Tecosky  86] 

[White  84] 

[Shapiro  84] 

[Teja  83] 

[Wood  86] 

[Shapiro  85] 

[Thompson  84] 

[Woods  85] 

[Siroux  85] 

[Thompson  85] 

[Wyatt  85] 

SECTION  2.  TIME 

[Anatharaman  86] 

[Dillman  84] 

[Hill  86]

[Brown  87] 

[Epstein  86] 

[Scott  83] 

SECTION  3.  ACCURACY 

[Calcaterra  82] 

[Elster  80] 

[Koelsch  87] 

[Dillman  84] 

[French  83] 

[Meisel  84]

SECTION  4.  SPEED  OF  ENTRY 

[Anatharaman  86] 

[Dillman  84] 

[Meisel  84]

[Bisiani  84]

[Hill  86]

[Sweeney  86]

SECTION  5.  EASE  OF  USE 
[Epstein  86] 


SECTION  6.  PRODUCTIVITY 
[Hager  86] 

[Pfauth  83] 

[Reardon  87] 

APPENDIX  C5  TRAINING  FACTORS


SECTION  1.  TRAINING  FACTORS 


[Anisworth  84] 
[Baker  84] 
[Banatre  83] 
[Biermann  85-2] 
[Blunden  80] 
[Bridle  83] 
[Bristow  86-1] 
[Bristow  86-2] 
[Brown  87] 
[Bruce  82] 
[Calcaterra  82] 
[Cater  84] 
[Cavazza  84] 
[Cerf-Danon  87] 
[Clements  87]
[Cochran  83] 
[Cole  85] 
[Connolly  86] 
[Conrad  83] 
[Cook  85] 
[Dabbagh  86] 
[Damper  85] 
[De  Mori  84]

[De  Mori  85-1]
[De  Mori  85-3]
[DI  Martino  84]
[EDP  Anal  83]
[Elenius  86] 


[Elster  80]
[Epstein  86] 
[Fallside  85] 
[Fallside  86] 

[Ford  83] 

[Foster  82] 
[French  83] 
[Friedman  84] 
[Frison  84-1] 
[Frison  84-2] 
[Good  84] 
[GovDatSys  86]
[Green  83] 

[Green  85] 
[Gubrynowicz  84] 
[Hager  86] 
[Harrison  84] 
[Hobbs  84] 
[Howell  83] 

[Hunt  83] 

[Hunter  85] 

[Int  Res  Dev  85] 
[Int  Res  Dev  87] 
[Ivall  86-1] 

[Ivall  86-2]
[Johnson  85] 
[Johnson  86] 

[Joost  83] 




[Kurzweil  86] 

[Lea  86] 

[Leggett  82] 
[Levinson  86] 
[Llaurado  82] 
[Lombardo  84] 
[Longuet-Higgins  85] 
[Mackie  87] 
[Maenobu  84] 

[Martin  86] 
[Mascarenas  84]
[Mavaddat  85]
[Meade  85]
[Meisel  84]

[Meloni  83] 

[Meloni  87] 

[Menke  87] 
[Mokhoff  84]

[Moody  85] 

[Moore  84-1] 

[Moore  84-2] 
[Myers  83]
[Nakagawa  84]
[Niemann  85]
[Nishida  86]
[Nocerino  85]

[NTIS  86-1]

[NTIS  86-2]

[NTIS  86-3]

[Pursley  85] 

[Smith  84] 

[NTIS  86-4]

[Rehsoft  84] 

[Spine  84] 

[NTIS  87-1]

[Reuhkala  83] 

[Stephens  83] 

[O’Neil  82]

[Roberts  86] 

[Sweeney  86] 

[Ogozalek  86] 

[Rollins  85] 

[Tanaka  83] 

[Osman  83] 

[Ross  84] 

[Taylor  86] 

[Paddock  83] 

[Rossi  83] 

[Tecosky  86] 

[Pallett  85]

[Salfer  85]

[Teja  83]

[Pallett  86] 

[Santarelli  84] 

[Thompson  84] 

[Pay  81] 

[Scagliola  83-2] 

[Thompson  85] 

[Pearkins  84]

[Scagliola  84] 

[Underwood  84] 

[Peckham  83] 

[Schalk  83]

[Viglione  84]

[Peckman  86] 

[Schmandt  85] 

[Viglione  86] 

[Philip  87] 

[Schotola  84] 

[Visser  87]

[Pierrel  87] 

[Scott  83] 

[Watrous  85] 

[Pister-Bourjot  87] 

[Seaman  82] 

[Wetterlind  86] 

[Pluhar  83] 

[Seaman  83] 

[White  84] 

[Poock  81-1] 

[Seaman  85] 

[Williams  85] 

[Poock  81-2] 

[Senensieb  84] 

[Wood  86] 

[Poock  83-1] 

[Shapiro  84] 

[Woods  85] 

[Poock  83-3] 

[Shapiro  85] 

[Wyatt  85] 

[Poock  83-5] 

[Shore  83] 

[Yellen  83]

[Poock  83-7] 

[Siroux  85] 

[Zue  83]

[Poock  84] 

[Smith  83] 

[Zue  84]

[Poock  85] 

SECTION  2.  SPEAKER  DEPENDENT  SYSTEMS 

[Cook  85] 

[Pister-Bourjot  87] 

[Epstein  86] 

[Rossi  83] 

SECTION  3.  SPEAKER  INDEPENDENT  SYSTEMS 

[Anisworth  84] 

[Maenobu  84]

[Pister-Bourjot  87]

[Connolly  86]

[Menke  87]

[Rossi  83] 

SECTION  4.  CONTINUOUS  SPEECH  RECOGNITION


[Banatre  83] 

[Lombardo  84] 

[Osman  83] 

[Bridle  83] 

[Maenobu  84]

[Pay  81]

[Connolly  86]

[Meisel  84]

[Poock  85]

[De  Mori  85-3]

[Meloni  87]

[Ross  84] 

[DI  Martino  84] 

[Moore  84-1] 

[Rossi  83] 

[Frison  84-1]

[Moore  84-2] 

[Tanaka  83] 

[Frison  84-2]

[Nakagawa  84]

[Zue  83]

[Hunt  83]

[Niemann  85]

SECTION  5.  DISCRETE  SPEECH  RECOGNITION
[French  83]

[Reuhkala  83] 

[Shore  83] 

SECTION  6.  RECOGNITION  ACCURACY 


[Calcaterra  82] 

[Meade  85] 

[Scagliola  84]

[Elster  80]

[Meloni  83]

[Schotola  84]

[French  83]

[Nishida  86]

[Scott  83] 

[Gubrynowicz  84] 

[Nocerino  85] 

[Smith  84] 

[Howell  83]

[Poock  81-1]

[Spine  84] 

[Levinson  86] 

[Poock  85] 

[Tanaka  83] 

[Longuet-Higgins  85] 

[Roberts  86] 

[Wetterlind  86] 

[Mackie  87]

[Rollins  85]

[Yellen  83]

[Maenobu  84]

[Scagliola  83-2]

[Zue  83]

[Mavaddat  85] 

APPENDIX  C6  HOST  COMPUTER  FACTORS


SECTION  1.  HOST  COMPUTER  FACTORS 


[Armstrong  80] 

[Ford  83] 

[Martin  86] 

[Bakst  87] 

[Foster  82] 

[Mascarenas  84]

[Banatre  83]

[Friedman  84]

[Meisel  84]

[Blunden  80]

[Good  84]

[Menke  87]

[Bridle  87] 

[Gould  83] 

[Mod  Mat  83] 

[Bristow  86-1] 

[GovDatSys  86] 

[Mokhoff  84]

[Bristow  86-2] 

[Green  83] 

[Moody  85] 

[Brown  87] 

[Green  85] 

[Murveit  83] 

[Bruce  82] 

[Haas  84] 

[Myers  83] 

[Calcaterra  82] 

[Hager  86] 

[NTIS  86-1]

[Cashen  86]

[Hill  86]

[NTIS  86-2]

[Cater  84] 

[Hunter  85] 

[NTIS  86-3]

[Cavazza  84]

[Int  Res  Dev  85]

[NTIS  86-4]

[Clements  87]

[Int  Res  Dev  87]

[NTIS  87-1]

[Cochran  83]

[Ivall  86-1]

[O’Neil  82]

[Cole  85] 

[Ivall  86-2] 

[Ogozalek  86] 

[Conrad  83] 

[Jinper  85] 

[Paddock  83] 

[Cook  85] 

[Joost  83] 

[Pallett  85] 

[Dabbagh  86] 

[Keller  85]

[Pallett  86] 

[De  Mori  85-2] 

[Koelsch  87] 

[Pearkins  84] 

[De  Mori  85-3] 

[Korzeniowski  86] 

[Peckham  83] 

[Dillman  84] 

[Kurzweil  86] 

[Peckman  86]

[EDP  Anal  83]

[Lea  86] 

[Philip  87] 

[Elenius  86] 

[Leggett  82] 

[Pierrel  87] 

[Elster  80]

[Llaurado  82]

[Pluhar  83]

[Epstein  86] 

[Lombardo  84] 

[Poock  80]

[Fallside  85] 

[Madron  84] 

[Poock  83-7]

[Fallside  86]

[Mariani  83]

[Pursley  85] 

[Rehsoft  84]

[Shapiro  85]

[Underwood  84]

[Rigoll  84]

[Silverman  85] 

[Viglione  84] 

[Rigsby  82] 

[Siroux  85] 

[Viglione  86]

[Santarelli  84] 

[Smith  83] 

[Visser  87]

[Schalk  83]

[Stephens  83]

[Watrous  85]

[Schmandt  85] 

[Sweeney  86] 

[White  84] 

[Seaman  82] 

[Taylor  86] 

[Wood  86] 

[Seaman  83] 

[Tecosky  86] 

[Woods  85] 

[Seaman  85] 

[Teja  83] 

[Wyatt  85] 

[Senensieb  84] 

[Thompson  84] 

[Zue  83]

[Shapiro  84] 

[Thompson  85] 

SECTION  2.  MICROCOMPUTERS 

[Calcaterra  82] 

[Haas  84] 

[Lombardo  84] 

[Dabbagh  86] 

[Hill  86] 

[Madron  84] 

[Elenius  86] 

[Jinper  85] 

[Mariani  83] 

[Epstein  86] 

[Keller  85]

[Murveit  83] 

[Friedman  84] 

[Koelsch  87] 

[Rigsby  82] 

[Good  84] 

[Korzeniowski  86]

[Sweeney  86]

SECTION  3.  MAINFRAMES 

[Calcaterra  82] 

[Cashen  86] 

SECTION  4.  NETWORKS 

[Banatre  83] 

[De  Mori  85-3] 

[Bridle  87] 

[Poock  80] 

SECTION  5.  TYPE  OF  ENTRY  REQUIRED 

[Armstrong  80] 

[Cook  85] 

[Meisel  84]

[Bakst  87]

[Hill  86]

[Pluhar  83] 

[Cochran  83] 

[Koelsch  87] 

APPENDIX  C7  EXPERIMENTS  AND  RESEARCH


SECTION  1.  EXPERIMENTS  AND  RESEARCH 


[Allen  83]

[Cavazza  84] 

[Friedman  84] 

[Anatharaman  86] 

[Cerf-Danon  87] 

[Frison  84-1]

[Andrews  84]

[Clements  87]

[Frison  84-2]

[Anisworth  84] 

[Cochran  83] 

[Good  84] 

[Armstrong  81] 

[Cole  85] 

[Gould  83] 

[Baker  84] 

[Connolly  86] 

[GovDatSys  86] 

[Bakst  87] 

[Conrad  83] 

[Green  83] 

[Banatre  83] 

[Cook  85] 

[Green  85] 

[Berman  84] 

[Dabbagh  86] 

[Gubrynowicz  84] 

[Betterton  83] 

[Damper  84] 

[Haas  84] 

[Bierfert  85] 

[Damper  85] 

[Hager  86] 

[Biermann  84] 

[De  Mori  84] 

[Haton  85]

[Biermann  85-1]

[De  Mori  85-1]

[Haton  87]

[Biermann  85-2]

[De  Mori  85-2]

[Henkle  83]

[Bisiani  84]

[De  Mori  85-3]

[Hill  86] 

[Blunden  80] 

[De  Mori  87-1] 

[Hobbs  84] 

[Bridle  82] 

[De  Mori  87-2]

[Howell  83]

[Bridle  83]

[Dillman  84]

[Hunt  83] 

[Bridle  84] 

[DI  Martino  84] 

[Hunter  85] 

[Bridle  87] 

[EDP  Anal  83] 

[Int  Res  Dev  80] 

[Bristow  86-1] 

[Elenius  86] 

[Int  Res  Dev  85] 

[Bristow  86-2] 

[Elster  80]

[Int  Res  Dev  87] 

[Bronson  85] 

[Epstein  86] 

[Ivall  86-1]

[Brown  87]

[Eskenazi  83]

[Ivall  86-2] 

[Bruce  82] 

[Fallside  85] 

[Jinper  85]

[Calcaterra  82] 

[Fallside  86] 

[Johnson  85] 

[Cashen  86] 

[Ford  83] 

[Johnson  86] 

[Cater  84] 

APPENDIX  E  CURRENT  VOICE  RECOGNITION  SYSTEMS


AMDAHL  COMMUNICATIONS  SYSTEMS  DIVISION 
2200  N.  Greenville 
Richardson,  Texas  75081
(214)  699-9500 

-45  Two  Channel:  Voice  Input/Output  (for  PBX) 

ARTICULATE  SYSTEMS  INCORPORATED 
2380  Ellsworth  St. 

Berkeley,  California  94704
(415)  549-1013 

Voice  Navigator  SR  System  (for  MAC  +,  SE,  II) 


AT&T  CONVERSANT  SYSTEMS 
6200  E.  Broad  St. 

Columbus,  Ohio  43213 
(614)860-3836  or  1(800)341-2272 

Conversant  1(tm)  Voice  System:  Voice  Input/Output  (for  Touchtone
data  input)


CONVERGENT  TECHNOLOGIES,  INC. 

2700  N.  First  St.,  P.O.  Box  6685 
San  Jose,  California  95150-6685 
(408)434-2848 

Voice  Master:  Voice  Input/Output  (for  Convergent  Technologies;  The 
Bell  212-A;  AT&T  Dimension;  Rolm  CBX;  Northern  Telecom  SL-1)


COVOX,  INC. 

675  D  Conger  St. 
Eugene,  Oregon  97402 
(503)342-1271 


Voice  Master:  Voice  Input/Output  (for  IBM;  Commodore  64;  Apple  &
Atari)


DRAGON  SYSTEMS  INCORPORATED 
Chapel  Bridge  Park 
55  Chapel  St. 

Newton,  Massachusetts  02158 

KEYTRONIC  CORPORATION
P.O.  Box  14687 
Spokane,  Washington  99214 
1(800)262-6006 

Keytronics  Model  KB5152V  (for  IBM  PC  XT)

HARRIS/LANIER
1700  Chantilly  Dr.,  NE
Atlanta,  Georgia  30324
(404)329-8000  or  1(800)241-1706

System  IV  (Digital  Dictation):  Voice  Input  (for  Lanier  Business  Products)

HEWLETT-PACKARD  CO.

3000  Hanover  St.

Palo  Alto,  California  94304 
(415)857-1501 

Office  Talk:  Voice  Input/Output  (for  IBM,  HP  Vectra)

IBM  (INTERNATIONAL  BUSINESS  MACHINES)

Old  Orchard  Rd.

Armonk,  New  York  10504
(914)765-1900

Juniper  II:  Voice  Input/Output  (for  IBM)

Model  6294-771:  Voice  Input/Output  (for  IBM)

PS/2  Speech  Adapter:  Voice  Input/Output  (for  IBM)

INTERSTATE  VOICE  PRODUCTS
1849  W.  Sequoia  Ave.

Orange,  California  92668
(714)937-9010

CSRB240  (Connected  Speech  Recognition  Board):  Voice  Input  (for  IBM)
LC-SRB  (Low-Cost  Speech  Recognition  Board):  Voice  Input  (for  IBM)
Systems  300:  Voice  Input  (for  RS-232C)

S4000  (Continuous  Speech  Voice  Data  Entry  Peripheral):  Voice  Input
(for  IBM)

VocaLink  Cellular  Module:  Voice  Input  (for  Mitsubishi;  OKI;  NEC)
VRC008:  Voice  Input  (for  TTL)

VRT  300:  Voice  Input  (for  DEC;  CIE  Terminals;  Plessey  Peripheral;
RS-232C)

KURZWEIL  APPLIED  INTELLIGENCE,  INC.

411  Waverley  Oaks  Rd.

Waltham,  Massachusetts  02154-8465 
(617)893-5151 

KVS  (Kurzweil  Voicesystem):  Voice  Input  (for  IBM  PC,  XT,  AT)

KVT  (Kurzweil  Voiceterminal):  Voice  Input  (for  DEC;  IBM;  Hewlett- 
Packard) 

MICROLOG  CORP. 

20270  Goldenrod  Lane 
Germantown,  Maryland  20874 
(301)428-3227  or  1(800)635-3355 

VoiceConnect  System  3000,  3500:  Voice  Input/Output  (for  IBM  PC,  XT,
AT)

MICROPHONICS  TECHNOLOGY  CORP. 

25  37th  St.  NE,  Suite  B 
Auburn,  Washington  98002 
(206)939-2321  or  1(800)325-9206 

Pronounce  Voice  Control  System:  Voice  Input  (for  IBM) 

MIMIC,  INC. 

P.O.  Box  705 

Islington,  Massachusetts  02090-0705 
(617)329-9593 

Mimic  Speech  Processor:  VOIS  (Voice  Output  for  Industrial  Systems): 
Voice  Input/Output  (for  OEM;  Microcomputer) 

NEC  AMERICA,  INC. 

Radio  &  Transmission  Division 
2740  Prosperity  Ave. 

Fairfax,  Virginia  22031 
(703)698-5540 

AR-10:  Voice  Input/Output  (for  IBM) 

DP-200:  Voice  Input  (for  RS-232C;  RS-422;  IEEE-488;  20  mA  current  loop)
S  AR-10:  Voice  Input/Output  (for  IBM) 

SR-10:  Voice  Input  (for  RS-232C) 

SR-100:  Voice  Input  (for  RS-232C;  NEC) 

PERIPHONICS  CORP.

4000  Veterans  Memorial  Hwy. 

Bohemia,  New  York  11716 
(516)467-0500 

TeleMarketer:  Voice  Input  (for  CDC;  DG;  DEC;  HIS;  IBM;  NCR;  Unisys; 
Wang;  PABX;  ACD) 

VoicePac  Announcement  System:  Voice  Input/Output  (for  CDC;  DG; 
DEC;  HIS;  IBM;  NCR;  Unisys;  Wang;  PABX;  ACD) 

SCOTT  INSTRUMENTS  CORP. 

1111  Willow  Springs  Dr. 

Denton,  Texas  76205 

(817) 387-9514 

Coretechs  VET-3  Voice  Entry  Terminal:  Voice  Input/Output  (for  RS- 
232C) 

Shadow/VET  Voice  Entry  Terminal:  Voice  Input  (for  Apple)

VET-2  Voice  Entry  Terminal:  Voice  Input  (for  Apple) 

SHURE  BROTHERS,  INC. 

222  Hartrey  Ave. 

Evanston,  Illinois  60202-3696 
(312)866-2200 

SM10  Headset  Microphone:  Voice  Input  (for  OEM)

VR  230  Two  Way  Headset:  Voice  Input/Output  (for  OEM) 

VR300  Gooseneck  Microphone:  Voice  Input  (for  OEM) 

503BG  Close-Talk  Microphone:  Voice  Input  (for  OEM) 

512  Two  Way  Headset:  Voice  Input/Output  (for  OEM)

SPEECH,  LTD. 

3790  El  Camino  Real,  Suite  213 
Palo  Alto,  California  94306 
(415)858-2207 

Protalker:  Voice  Input/Output  (for  IBM;  OEM;  Microcomputer) 

SPEECH  SYSTEMS,  INC. 

18356  Oxnard  St.
Tarzana,  California  91356 

(818) 881-0885 

DS100  Phonetic  Engine:  Voice  Input  (for  RS-232C)

PE200  Phonetic  Engine:  Voice  Input  (for  IBM;  RS-232C)

SUDBURY  SYSTEMS,  INC.

31  Union  Ave. 

Sudbury,  Massachusetts  01776 
(617)443-8966  or  1(800)245-7817 

RTAS:  Voice  Input/Output

SUNCOAST  SYSTEMS,  INC. 

3100  McCormick  St.,

Suite  22,  P.O.  Box  7105
Pensacola,  Florida  32514
(904)478-6477  or  1(800)843-9363 

Computerfone:  Voice  Input/Output  (for  OEM) 

TECMAR,  INC. 

6225  Cochran  Rd. 

Solon,  Ohio  44139 
(216)349-1009 

Voice  Recognition  Board:  Voice  Input  (for  IBM  PC) 

TEXAS  INSTRUMENTS,  INC. 

P.O.  Box  655012 
Dallas,  Texas  75265 
1(800)527-3500 

Speech  Command  System:  Voice  Input/Output  (for  IBM;  TI) 

VOICE  COMPUTER  TECHNOLOGIES  CORP. 

5730  Oakbrook  Pkwy 
Norcross,  Georgia  30093-1888 
(404)441-2303 

VCT  Series  2000  Model  2016:  Voice  Input/Output  (for  CDC;  DG;  DEC;
HIS;  IBM;  NCR;  Unisys;  Microcomputer)

THE  VOICE  CONNECTION 
17835  Sky  Park  Circle,  Suite  C 
Irvine,  California  92714 
(714)261-2366 

IntroVoice  I:  Voice  Input  (for  Apple  II,  Apple  IIe;  RS-232C)
IntroVoice  II:  Voice  Input  (for  Apple)

IntroVoice  III:  Voice  Input  (for  IBM  PC,  XT,  AT)
IntroVoice  V:  Voice  Input  (for  IBM;  Compaq  386)

IntroVoice  VI:  Voice  Input  (for  IBM  PS/2,  PC,  XT,  AT;  Compaq  386) 
PVDL  (Portable  Voice  Data  Logger):  Voice  Input/Output  (for  IBM)
VMC  2020:  Voice  Input  (for  Apple  II,  IIe)

VOICE  INDUSTRIES  CORP.  (VERBEX) 

10  Madison  Ave. 

Morristown,  New  Jersey  07960 
(201)267-7505 

Series  4000,  5000:  Voice  Input  (for  RS-232C)

VOTAN 

4487  Technology  Dr. 

Fremont,  California  94538 
(415)490-7600 

Voice  Management  System:  Voice  Input/Output  (for  RS-232C; 
Centronics  parallel) 

Votan  Voice  Card  (Board  Level):  Voice  Input/Output  (for  IBM)

VSP  1000  (Board  Level):  Voice  Input/Output  (for  IEEE-786)

VTR  3270:  Voice  Input/Output  (for  IBM;  Coax)
VTR-6050  Series  II:  Voice  Input/Output  (for  RS-232C)

VYNET  CORP. 

180  Knowles  Dr. 

Los  Gatos,  California  95030 
(408)370-0555;  (408)370-9764;  or 
1(800)538-7002 

V2100  Telephone  Voice  Response  System:  Voice  Input/Output  (for  IBM) 
V2301/V1202/V2202  Telephone  Speech  Digitizer  &  Playback: 

Voice  Input/Output  (for  IBM) 

V4000  Telephone  Voice  Response  System:  Voice  Input/Output  (for  IBM) 

XTRA  BUSINESS  SYSTEMS 
2350  Qume  Dr. 

San  Jose,  California  95131 
(408)945-8950 

Voice  Communications  System:  Voice  Input/Output  (for  XTRA  Series)